PREFACE

The growing importance of the statistical method in physics is 
amply attested by the recent publication of several elaborate treatises 
emphasizing in their titles the words statistical mechanics. There 
have also lately appeared a number of other books containing detailed 
application of statistics to the properties of matter in the solid, liquid, 
and gaseous states. Most of these, though very useful to the special- 
ist, are likely to appear rather formidable to the student who is just 
embarking on graduate study and who wishes a thorough but not too 
lengthy introduction to the method of statistical physics. The author 
has tried to provide this in the present work, which is intended for 
readers equipped with an introductory background in theoretical 
physics. 

In this book the attempt has been made to survey as thoroughly 
as possible the various ways in which statistical reasoning has been 
used in physics from the classical applications to fluctuation phe- 
nomena, kinetic theory, and statistical mechanics to the contemporary 
quantum mechanical statistics. Emphasis has been laid on meth- 
odology. The author has taken the point of view that the greater 
the number of vantage points from which the subject is examined the 
deeper will be the student's understanding. For this reason no 
attempt has been made to provide a strictly unified treatment which 
would appear to be more logical to many. At the same time, how- 
ever, particular effort has been exercised to relate the various statisti- 
cal methods in order that the reader will see their similarities as well 
as their differences. A glance at the table of contents will show that 
specific applications have not been neglected, and there are numerous 
problems to test the reader's grasp of the subject. 

It is appropriate for the author to acknowledge in the preface his 
indebtedness to those who have given help or encouragement. Such 
acknowledgment is usually confined to professional colleagues or the 
writings of the masters. Too seldom is attention paid to the con- 
tribution of the author's students, whose thorough and patient study 
of the various stages of organization of the book leads to gradual 
improvement in the correctness and clarity of presentation. The
author feels that he owes a definite debt of gratitude to the graduate
students at Brown University who during the past few years have 
taken the course on which the book has been based. Particular 
acknowledgment is due to Mr. J. A. Rich for help with the diagrams 
and the proof. 

R. B. Lo 

June, 1941.

CHAPTER I
DYNAMICAL AND STATISTICAL THEORIES 

1. THE METHOD OF PHYSICS 

Physics is an attempt to describe a certain portion of human 
experience. From the totality of sense perceptions, the physicist 
abstracts certain ones for special study and by observation and ex- 
periment develops a body of propositions called physical facts. He 
then proceeds to construct on the basis of these facts certain concepts 
which are defined in terms of the more primitive ideas. Such, for 
example, are the kinematical concepts of mechanics: displacement, 
velocity, acceleration. These concepts are expressed in symbolic 
form so that they are amenable to the usual processes of mathematical 
manipulation. Next, with experience as a guide and usually also 
with liberal use of the imagination, certain relations, appropriate to 
a given set of phenomena, are postulated among the concept symbols. 
These with the concepts themselves form the hypotheses of the phys- 
ical theory which is supposed to be the ultimate physical description 
of the phenomena. From these postulated relations, usually in the 
form of differential equations, can be derived by mathematical anal- 
ysis the laboratory equations which are called physical laws. 

Physical laws are equations containing quantities which have a 
direct operational significance in the laboratory and to which numbers 
can be assigned by experiment. Hence these laws are susceptible of 
experimental test. If they meet this test successfully the theory 
from which they have been deduced is to that extent successful and 
a valuable element of physical description. This does not mean, 
however, that the theory is thereby proved to be true. In the first 
place there may be quite another theory operating with different 
concepts and different hypotheses which yields the same laws; in 
the second place the attempt to extend the theory to include just 
one additional bit of experience may fail of verification. It is much 
safer therefore to say of a physical theory that it is successful or un- 
successful: the successful theory is one which not only implies laws 
agreeing with already known experience but also predicts laws for 
experiments that have not yet been tried and which leads to complete 
verification on experiment. Such a theory adds
knowledge, since it suggests experience which presumably no one
had previously taken the trouble to acquire. 1 

It is clear that different kinds of physical theories can be con- 
structed. The above remarks suggest that in any comparison of 
these we should then consider the following elements: (1) the fun- 
damental concepts, (2) the postulated relations or basic hypotheses, 
and (3) the types of physical laws resulting from (2). On this basis 
we shall now try to compare two different theoretical structures of 
importance, viz., the dynamical and the statistical. 

2. THE NATURE OF A DYNAMICAL THEORY 

The theory of dynamics or mechanics, as it is often called, has 
had remarkable success in describing a wide variety of natural phe- 
nomena. It is the oldest of physical theories and its concepts, hy- 
potheses, and laws have been so thoroughly studied that they have 
acquired an air of familiarity not shared by those of any other theory. 
This does not mean that the structure of dynamical theory is less 
abstract than that of electromagnetic theory, for example, but merely 
that we have got so used to thinking in terms of mechanics that we 
no longer feel the abstractness. Its essential aim is to describe first, 
all the observed motions of bodies and second, the physical phe- 
nomena in which motion is not actually observed, in terms of the 
motions of invisible bodies. The fundamental dynamical concept 
is the material particle 2 which is assumed to have position without 
extension, the property of inertia whose measure is mass, and certain 
relations with respect to other particles, e.g., gravitation. The con- 
cepts of displacement, velocity, and acceleration of a material particle 
are constructed using the primitive notions of space and time. Cer- 
tain hypotheses are then introduced, forming the content of what 
are commonly known as the "laws" of motion, e.g., F = ma. These 
might better be called the "principles" of mechanical theory, since
from them by suitable manipulation the laws, i.e., the laboratory 
equations containing time and distance, etc., which describe the 
actual motions of particles, can be derived. 

1 This statement of the method of physics is a highly compressed treatment
which naturally does not pretend to do justice to the profundity of the theme. For
a more elaborate statement the reader is referred to Lindsay and Margenau,
"Foundations of Physics," Chapter I, John Wiley & Sons, New York, 1936.

2 Cf. op. cit., Chapter III for an extensive exposition of the fundamentals of
mechanics which are here given only in abbreviated form.

Now classical mechanics, constructed in this way, has been very
successful in dealing with the motion of a single particle in a fixed
reference system, of two particles moving subject to their mutual 
interaction (a much more practical situation, of course, than the 
highly idealized isolated particle), and of a large number of particles 
which are rigidly connected so that they exhibit no relative motion 
(rigid dynamics). Classical mechanics has also proved itself adequate 
to describe the motion of ideally continuous aggregates of particles 
such as fluids (hydrodynamics). But it is important to recognize 
that the general problem of the motion of a large number of discrete 
particles subject to the action of arbitrary forces cannot be com- 
pletely solved by classical dynamics. It is true that much can be 
learned about the motion from the well-known deductions from 
dynamical principles: the laws of the conservation of mechanical 
energy and momentum. The complete solution would imply that, 
given the initial conditions (the initial position and initial velocity 
of every particle) the position and velocity of every particle are de- 
terminable for all time. Even if the solution of the problem could 
be given it would be useless for a very large number of particles, be- 
cause of our inability to assign the initial conditions. Suppose, for 
example, that we are trying to describe the behavior of a gas on the 
assumption that it consists of a very large number of tiny particles 
which move under the action of their mutual forces in accordance 
with the principles of mechanics. It turns out that the number of 
particles (molecules) per cubic centimeter of the gas has to be so large 
that it is completely hopeless to try to specify position and velocity 
for all of them at any one instant. It then appears that the dynamical 
method suffers here a serious check. 

In a situation of this kind the physicist does not give up in despair. 
He looks around for possible ways out. Two at least would appear 
to be available. In the first place one could decide to replace the large 
number of discrete particles by an ideal continuum and apply me- 
chanical principles to this. Unfortunately it appears that while, in 
following this course, we are able to describe certain properties of 
the gas, there are other very important ones which elude this mode 
of attack. Therefore we fix our attention on the alternative pro- 
cedure, which is simply to forego exact information about individual 
particles and to allow our questions to concern merely the average 
number of particles which have a certain range of properties at a 
given instant. When we do this we are departing from strict dynami- 
cal theory and are introducing a method which has received the name 
statistical.

3. THE NATURE OF A STATISTICAL THEORY

We have now to try to make clear how a statistical theory differs 
basically from a dynamical theory in physics. To some persons this 
may well appear a futile task, for it will be asked, how can one under- 
stand a new theoretical point of view until he has studied all its de- 
tails? The answer is that unless he is given some idea about it and 
its relations to things otherwise familiar he may not wish to study 
it at all; at any rate he will not study it with the same appreciation. 
People have a certain curiosity to know something about what they 
are getting into when they begin to learn a new discipline. Without 
laying claim to any particular profundity, in this section we shall 
go a little way toward satisfying this curiosity. 

At first thought it might seem that the difference between a statis- 
tical theory and a dynamical theory is not so very great, at any 
rate in the case we have above described, viz., the motion of a very 
large number of particles. The fundamental concepts used are those 
of classical mechanics, i.e., velocity, acceleration, force, mass, energy, 
etc. Moreover, the principles of mechanics are employed as funda- 
mental assumptions, but the kinds of laws derived are not those of 
classical dynamical systems, and we have to look closely to find 
just where the difference lies. In the first place the statistical laws tell 
us nothing about any particular individual particle; they are entirely 
concerned with numbers of particles which have certain properties, 
e.g., position in space, velocity, momentum, energy, at a given in- 
stant. Moreover they do not even pretend to tell us the precise 
number of particles in any case, but only an average number, from 
which there will in general be fluctuations. The idea of an average 
quantity is not foreign to classical dynamics, where we often find it 
useful to speak about the average velocity of a particle, or the average 
kinetic energy of a simple harmonic oscillator, for example. In clas- 
sical dynamics, however, we never speak of the average number of 
particles in a given state; this is a definite criterion distinguishing 
between the two modes of description, even though the same funda- 
mental concepts occur in both. 

Moreover, even though both classical dynamics and statistical 
physics employ averages their significance is quite different; when 
in dynamics we employ averages over time it is merely a matter of 
convenience and not because we cannot compute the precise values 
of the quantities in question at any instant. One of the fundamental 
characteristics of mechanical laws is that once the boundary and 
initial conditions have been inserted, they predict precisely the values
of all the variables involved at any instant. Dynamics implies absolute
determinism for any physical system. This is well illustrated
in celestial mechanics, wherein from a few observations the path of 
a planet or even a comet can be computed with great accuracy. Now 
things are quite otherwise in a statistical theory. Here the average 
is employed because there is no possibility within reason of predicting 
an exact value. Hence by the very use of such averages, we forego 
precise determinism. It must be clearly understood that in this re- 
nunciation we do not deny that there may be determinism in the 
system which we are studying, e.g., in a gas, the number of molecules 
in a given element of volume may actually be uniquely determined 
as a function of the time. However, the effort to follow through the 
precise variation with time may be so great as to become unreasonable;
it may defeat its own ends by rendering physical description
too complicated to be worth while. Rather than be balked by this 
unhappy situation we decide to get along with average numbers over 
appropriate periods of time. We do not worry if the actual number 
at any instant within such a volume interval differs from the average, 
as long as experimental observations indicate that this difference 
does not become too large or persist for too long a time. 

The reader who has followed the above paragraph closely will 
undoubtedly be inclined to ask why the statistical theory is effectively 
any less deterministic than the dynamical theory if we can calculate 
average numbers precisely and get agreement with experiment. Is 
not that all that can be expected of a theory? The answer to this 
question evidently involves the way in which the averages are calcu- 
lated. As we have emphasized, when we compute the average of a 
quantity in classical mechanics, we know the value at every instant 
of time or every point in space as the case may be; the calculation 
of the average is a mere matter of convenience. In the problems 
treated by the statistical method we do not know the instantaneous 
values of the important quantities, and so the question at once arises: 
how do we propose to calculate their averages? To answer this 
query fully is indeed the function of a book on physical statistics. 
However, we can at least say here that the calculation of statistical 
averages is based fundamentally on the concept of probability, a 
concept which seems to have by no means so clear a meaning as the 
concepts of mechanics, for example, but of which nevertheless we 
all feel we have an intuitive grasp. This notion is foreign to dynamical 
theory, though it enters into every element of experience including 
the most precise of physical experiments, where it is reflected in our 
treatment of the variations in the successive measured values of the
same quantity, i.e., the theory of errors. To this extent dynamical
theory is a highly idealized model since it neglects a very significant 
feature of experience. On the other hand the human urge to believe 
in causality has hitherto placed a high premium on the value of a 
deterministic theory. This is a good illustration of the competition 
in the construction of concepts which is constantly at work in the 
human mind in its attempts to describe experience. There are ele- 
ments in experience which make us feel that determinism is the correct 
point of view in physics and incline us toward dynamics; there is 
also much experience which emphasizes the importance of chance. 
This inclines us toward the type of theory which openly employs 
chance; statistics is a theory of this kind. 

There is still another difference between dynamical and statistical 
theories. This is the distinction between reversible and irreversible 
processes. Let us consider a very simple example, namely a particle 
which moves with constant acceleration a along the x axis. We 
assume the initial conditions $x = 0$, $\dot{x} = 0$ at $t = 0$ (NOTE: the dot
notation is used to indicate differentiation with respect to the time).
Suppose the particle reaches the position $x = x_1$ at time $t = t_1$, where
$x_1 = a t_1^2/2$, in the usual way. Now suppose that at $t = t_1$ the
existent velocity $\dot{x}_1 = a t_1$ is reversed. What happens? We now
have the new initial conditions $x = x_1$, $\dot{x} = -a t_1$, whence at the
end of a second time interval of magnitude $t_1$, the distance of the
particle from the origin becomes $x = x_1 - a t_1^2 + a t_1^2/2$. But from
$x_1 = a t_1^2/2$ it follows that $x = 0$. Also the final velocity at the end
of the second time interval is $\dot{x} = -a t_1 + a t_1 = 0$. This means
that the motion of the particle has completely reversed itself. By
the mere reversal of the velocity we have brought the particle from
the state it had reached at time $t_1$ back to the original state from which
it started. We could have achieved the same result by changing the
sign of the time at $t = t_1$ and allowing the $t$ parameter to go from $t_1$
to zero. It is seen that this is mathematically equivalent to reversing
the velocity. In any case we have here an example of a reversible 
motion or process, i.e., one which by suitable manipulation of the 
parameters can be made to reverse itself and proceed back through 
all its successive previous states to the initial state, without at the 
same time changing the state of any other physical system. It is 
a comparatively simple matter to prove 3 that strictly dynamical sys- 
tems undergo reversible processes only. 

3 For a short demonstration see op. cit., p. 195.

Now let us consider a somewhat different illustration. We suppose
that the particle just discussed moves along the x axis under
the action of a constant force, but that in addition its motion is resisted
by a force varying directly as its velocity, or in general of such
a character that it always opposes motion; in short, we suppose that
a frictional force acts on the particle. If now we carry out the same
sort of process indicated above, it will be found that the motion does
not reverse itself as before; the particle does not return to its original
position and velocity after time $t_1$ if the velocity is reversed at $t = t_1$
when $x = x_1$. In fact a simple calculation (left for the reader to perform)
shows that the particle always falls short of reaching its initial
position under these conditions. This is an illustration of an irreversible
process, or one which cannot be annulled simply by reversing
some of the parameters of the system in question without disturbing
or changing in any way the environment. Thus, in the example just
described the only way to get the particle back to its initial state in
the time interval $t_1$ after the reversal of the velocity at $x = x_1$ at time
$t = t_1$, will be for some outside influence to compensate for the dissipative
effect of the friction.
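
The two illustrations above can be checked numerically. The following is a minimal sketch, not the author's own calculation: it assumes unit mass, writes the equation of motion as dv/dt = a - gamma*v (with gamma a hypothetical friction coefficient per unit mass, gamma = 0 being the frictionless case), and uses the closed-form solution of that equation. The function names are illustrative only.

```python
import math

def state_after(x0, v0, a, gamma, t):
    """Position and velocity after time t for dv/dt = a - gamma*v.

    a     : constant force per unit mass (assumed)
    gamma : friction coefficient per unit mass (assumed; 0 means no friction)
    """
    if gamma == 0.0:
        # Uniformly accelerated motion.
        return x0 + v0 * t + 0.5 * a * t * t, v0 + a * t
    decay = math.exp(-gamma * t)
    v_limit = a / gamma  # limiting velocity under linear friction
    v = v_limit + (v0 - v_limit) * decay
    x = x0 + v_limit * t + (v0 - v_limit) * (1.0 - decay) / gamma
    return x, v

def reverse_and_run_back(a, gamma, t1):
    """Start at rest at the origin, run for t1, reverse the velocity, run for t1 again."""
    x1, v1 = state_after(0.0, 0.0, a, gamma, t1)
    return state_after(x1, -v1, a, gamma, t1)

# Frictionless case: the particle is back at the origin, at rest.
print(reverse_and_run_back(a=2.0, gamma=0.0, t1=3.0))

# With friction: the particle falls short of the origin (x > 0)
# and is not at rest when the second interval ends.
print(reverse_and_run_back(a=2.0, gamma=0.5, t1=3.0))
```

Under these assumptions the frictionless run retraces its states, while the run with friction ends at a positive distance from the origin, in agreement with the statement that the particle falls short of its initial position.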

The distinction between reversible and irreversible processes is 
of fundamental theoretical significance. Owing to the prevalence 
of frictional forces, it is clear that irreversible processes are actually 
the rule in nature. The question then arises: Why do we use the 
concept of reversible process at all? The answer is that this type of 
process is associated with the dynamical method of description as 
the above illustration (first case) has just shown. However, the
further question will immediately be asked: Did we not effectively 
give a dynamical description of the irreversible process also and does 
this not mean that both reversible and irreversible processes can be 
described by dynamics? It will be noticed, however, that the solu- 
tion of the second problem was rather artificial since we assumed a 
frictional force proportional to the velocity without in any way seek- 
ing to understand the nature of the frictional force more closely. 
Hence although it is true that classical dynamics can handle some 
kinds of irreversible processes, it is only in a rather formal way and 
difficulties are encountered as soon as a more thoroughgoing treatment 
is contemplated. We can put the matter thus: if we leave out of 
dynamics forces varying as some odd power of the velocity, classical 
dynamics describes only reversible processes. By formal generali- 
zation one can make dynamics describe certain irreversible processes, 
namely, those in which small frictional dissipation enters. The essen- 
tial reason for this is that, although friction is always present in 
natural motions, by suitable manipulation we can make it so small 
as to have a negligible effect over a period of time which is of interest
to us. Thus consider as an example a simple pendulum swinging in
air. The simple dynamical theory of the motion describes it as 
periodic, so that once started the pendulum swings indefinitely with 
a characteristic frequency and the same amplitude with which it 
started. Actually it is not observed to do this; each swing is a bit 
smaller than the previous one and the motion gradually comes to 
rest. If, however, the motion takes place in a region from which 
some of the air has been removed, the dissipative effect is observed 
to be much less and the pendulum takes a much longer time to come 
to rest. We therefore feel that in the limit of no frictional resistance 
the motion would be completely periodic and hence reversible. Actu- 
ally even in motion in air if we are content to restrict ourselves to a 
time interval which does not exceed too many periods of the motion, 
the dissipation can be neglected and the dynamical treatment is for 
many purposes satisfactory. It comes down to this: for the sake of 
what we call simplicity we use the dynamical theory with its concomitant
reversibility when it leads to approximately correct agreement
with experiment, i.e., in cases in which the irreversibility can be neglected
in the ideal limit.

Now there are phenomena in nature which appear to be so funda- 
mentally irreversible that in no ideal limit can we consider them as 
reversible. Such is the flow of heat. It is an experimental fact that 
heat is always observed to flow from a body of higher temperature 
to one of lower temperature as long as no outside influence is imposed. 
If this process were reversible it should be possible, without in any 
way disturbing the state of other bodies but merely by reversing the 
sign of some parameter connected with the flow, to make the flow 
proceed in the other direction. Actually there appears no way of 
doing this: to make heat flow from a low temperature to a higher 
temperature requires external work (as in a refrigerator). Hence the 
process is not reversible. 

Is there anything inherent in the statistical point of view which 
renders it particularly fitted for the description of systems under- 
going irreversible processes? The answer is yes, for we have seen 
that the method of statistics uses average distributions of particles 
with respect to certain properties. It calculates these averages by 
the use of probabilities and the natural assumption is that those dis- 
tributions will actually be realized which have the highest proba- 
bility. Hence there will be a tendency for distributions to change 
in the direction of increasing probability. This change is evidently 
a one-way process and by nature irreversible. To be sure, there is 
involved the curious circumstance that, since we have only probability
considerations to guide us, there will always be a finite probability
of any process going either way, e.g., in the conduction of heat,
we cannot rule out as impossible the uncompensated flow of heat from 
the cooler to the warmer body. All we are allowed to say is that the 
flow in the other direction is overwhelmingly more probable. 4 

4. NON-MECHANICAL STATISTICS 

In the previous section we were led to the concept of a statistical 
theory through a type of problem which actually employs the funda- 
mental concepts of mechanics but in which the method of dynamics 
is unable to give a complete solution. The type of statistical theory 
which uses mechanical concepts is very important for physics but it 
must be pointed out that it is by no means the only useful kind of 
application of statistics in physics. There are some physical phe- 
nomena in which the concepts of mechanics seem to play no role at 
all. An example is the phenomenon of radioactive decay which appears 
to be best described by saying that in a special, simple case of disinte- 
gration the number of radioactive atoms disintegrating per unit time 
is directly proportional to the number present. No mechanism is 
provided to govern the disintegration and the treatment is purely 
statistical, leading to the well-known equation 

$$N = N_0 e^{-\lambda t} \tag{1}$$

giving the number of atoms undisintegrated after time $t$ if $N_0$ is the
original number present. It is to be observed that nothing remotely 
connects this formula with dynamics. It merely associates a number 
with the time and does so on a probability basis, i.e., the number is 
an average. We may well expect fluctuations from it when the experi- 
ment which it describes is repeated again and again. We must make 
one more important observation: the number N must be a large 
number if the formula is to have much meaning. This, of course, is 
universally true of statistical formulas: they lose their significance 
if the numbers entering into them are too small. 
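
The remark that the formula loses meaning for small numbers can be made quantitative with a short sketch. It rests on an assumption not stated in the text: each atom is taken to survive the time t independently with probability exp(-lambda*t), so that the number of survivors follows a binomial distribution whose mean is eq. (1). The function names are hypothetical.

```python
import math

def expected_survivors(n0, decay_constant, t):
    """Average number of undisintegrated atoms, eq. (1)."""
    return n0 * math.exp(-decay_constant * t)

def relative_fluctuation(n0, decay_constant, t):
    """Standard deviation of the survivor count divided by its mean.

    Assumes independent atoms, i.e. a binomial distribution with
    survival probability p = exp(-decay_constant * t).
    """
    p = math.exp(-decay_constant * t)
    return math.sqrt((1.0 - p) / (n0 * p))

# The same decay constant and time, with very different initial numbers.
for n0 in (10, 10_000, 10**20):
    mean = expected_survivors(n0, decay_constant=0.1, t=5.0)
    spread = relative_fluctuation(n0, decay_constant=0.1, t=5.0)
    print(n0, mean, spread)
```

On this model the relative fluctuation falls off as the inverse square root of the initial number, which is one way of seeing why eq. (1) is a useful description only when the number of atoms is large.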

It is then clear that whether we begin with the assumption (a) that 
all physical phenomena are ultimately describable in terms of me- 
chanical concepts or (b) that some physical phenomena are not 
explicable in this way, we are led to the desirability of using statistical 
reasoning in physics. It is the plan of this book to indicate how this 
can be done and to give many illustrations of the actual process 

4 For further general discussion, op. cit., pp. 196 ff., may be com

PROBLEMS AND QUESTIONS

1. Prove that when a force varying directly with the velocity acts on a particle 
subject otherwise to a constant force, irreversible motion results. 

2. Make a list of irreversible processes in physics and indicate which of them 
can be replaced effectively by ideal reversible processes. Discuss some of these 
processes in detail. 

3. Prove by the use of Hamilton's principle (cf. eq. 8 of Chapter VI) that all 
conservative dynamical systems undergo reversible processes only. 

4. A particle falls under the influence of gravity through a medium which resists 
its motion by a force varying directly as the velocity. Show that the velocity of 
the particle approaches a limiting value and comment on the connection between 
this and the irreversibility of the motion. 

5. Set up the differential equation whose solution is eq. (1) of this chapter. 
Give two physical interpretations of the constant $\lambda$. From the fact that $N$ and $N_0$
in eq. (1) must be integers, what mathematical difficulty do you find associated with
the formula? Should a physicist be greatly disturbed over this difficulty? Why?



CHAPTER II 
ELEMENTARY PROBABILITY AND STATISTICS 

1. A SIMPLE PROBLEM IN PHYSICAL STATISTICS 

We begin our study with a definite physical problem which has 
some interest in itself and yet is simple enough to illustrate clearly 
the fundamental statistical methods we intend to develop. 

Consider a single material particle which is restricted to move along 
a straight line, say the x axis. Suppose that in time $T$ it makes $n$ displacements
each of length $l$, where $n \gg 1$. These displacements can
be either in the positive or negative x direction. In fact we shall
assume that a positive displacement is just as likely or probable as a
negative displacement. Let us further suppose that the number of
positive displacements in time $T$ is $n_1$ and the number of negative
displacements $n_2$, so that $n_1 + n_2 = n$. The distance of the particle
from its starting point at time $T$ is then

$$L = l(n_1 - n_2). \tag{1}$$

Now if we knew all the forces acting on the particle we should, of 
course, be able to calculate by the principles of mechanics the exact 
value of L. However, we are here assuming that we do not know the 
forces and therefore cannot use mechanics. The best we can do is to 
try to calculate an average value of L. To see the meaning of this, 
imagine that it is possible to carry out an experiment and observe 
L directly. Further suppose that the experiment can be repeated 
many times, very likely with differences in the values of L obtained. 
We could then find an average L by direct arithmetical means. 
However, life is too short to spend our time on such experiments. 
What we should like to do is to calculate an average L with the hope 
that it will agree well enough with that experimentally observed to 
serve as the basis for further theoretical predictions about the behavior 
of the particle. 

As stated in Chapter I, in order to calculate an average when we
do not know the detailed time-course of the phenomenon, we must 
have and use some probability values. This means that we must be 
able to compute the probability associated with each value of L. 
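
For a small number of displacements the probability associated with each value of L can be found by direct enumeration. The sketch below is an illustration only: it assumes that every sequence of positive and negative displacements is equally likely (in keeping with the assumption that the two directions are equally probable), takes a unit step length by default, and uses hypothetical function names.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def distribution_of_L(n, step=1):
    """Exact probability of each value of L = step * (n1 - n2) for n displacements.

    Assumes all 2**n sign sequences are equally likely.
    """
    counts = Counter()
    for signs in product((+1, -1), repeat=n):
        counts[step * sum(signs)] += 1
    total = 2 ** n
    return {L: Fraction(c, total) for L, c in sorted(counts.items())}

def average(dist, f=lambda L: L):
    """Average of f(L) weighted by the probabilities in dist."""
    return sum(p * f(L) for L, p in dist.items())

dist = distribution_of_L(4)
for L, p in dist.items():
    print(L, p)
print("average L:", average(dist))
print("average L squared:", average(dist, lambda L: L * L))
```

The enumeration is practical only for small n, since the number of sequences doubles with each added displacement; for the large n of physical interest a counting formula is needed in place of the explicit loop.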

2. ELEMENTARY PROBABILITY

It is not practicable in this book to explore all the problems and
controversies associated with the definition of probability. It is
obviously a concept difficult to define logically so as to be proof against
all objections that can be brought against it. Fortunately for the 
present purpose we can be content with the simple "frequency" view- 
point, which is essentially that of v. Mises. 1 If we consider once 
more the n displacements of the particle (Sec. 1) in time T, an impor- 
tant concept is the total number of ways in which the n displacements 
can be divided into two groups, i.e., positive and negative, without 
any restriction on the number in each group. If there were only one
displacement it could take place in either of two ways, viz., either 
positive or negative. If there were two displacements there would be 
four ways of performing them, as indicated in the following table:

    First displacement     +   +   -   -
    Second displacement    +   -   +   -

For $n$ displacements the number of ways in question is $2^n$. This is a
mere matter of counting: there are two ways of performing the first
displacement, two ways for the second, ..., two ways for the $n$th.
They are all assumed to be independent of each other, i.e., the fact
that the first happens to be positive or negative entails no restriction
on any of the subsequent ones. Hence the total number of ways of
grouping the displacements is $2 \cdot 2 \cdot 2 \cdots$ to $n$ factors or $2^n$. Now of
all these ways there will be a certain number corresponding to $n_1$
positive displacements and $n_2$ negative displacements (where
$n = n_1 + n_2$). This number is very readily obtained. We can choose the
first positive displacement in $n$ ways, the second in $(n - 1)$ ways,
the third in $(n - 2)$ ways, and finally the $n_1$th in $(n - n_1 + 1)$
ways. Since these are all independent the total number of ways
desired would appear to be $n(n - 1) \cdots (n - n_1 + 1)$. But many of
these correspond to mere rearrangement of the $n_1$ positive displacements.
We are not interested in the order in which the displacements
occur but merely in their number and must therefore divide the
above by the number of ways in which $n_1$ displacements can be rearranged
among themselves, namely, $n_1!$. Therefore the total number

1 R. v. Mises, " Wahrscheinlichkeitsrechnung," Leipzig, 1931. For a brief re- 
view of this point of view, consult "Foundations of Physics," pp. 159 ff. 
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of ways of dividing the n displacements into HI positive and n% nega- 
tive without regard to order is 



(n\ n( 

w ~ 



By multiplying numerator and denominator by n 2 l we can bring this 
into the more symmetrical form 



The situation then is this: Out of $2^n$ ways of grouping $n$ displacements 
into the two classes, positive and negative, there are $\binom{n}{n_1}$ ways 
for which $n_1$ are positive and $n - n_1 = n_2$ are negative without regard 
for order. The relative frequency of occurrence of the $n_1$, $n_2$ 
combination is therefore given by the ratio $\binom{n}{n_1}/2^n$, and we shall call 
this the probability $P_{n_1}$ of the occurrence of such a combination. Thus 

$$P_{n_1} = \frac{1}{2^n}\binom{n}{n_1} = \frac{1}{2^n}\,\frac{n!}{n_1!\,n_2!}. \tag{3}$$

In our problem this is the probability connected with the value 
$L = l(n_1 - n_2)$ for the final displacement of the particle from its 
initial position after time $T$. 

The reader familiar with v. Mises' notation will see that we are 
assuming that the total number of ways of grouping the $n$ displacements 
into the two groups forms a so-called probability aggregate. It 
is unnecessary, however, to discuss the fundamental nature of such 
an aggregate. We note only that the quantity $P_{n_1}$ in (3) is always a 
proper fraction and that 

$$\sum_{n_1=0}^{n} P_{n_1} = 1. \tag{4}$$
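The counting argument of this section is easily checked by brute force for small n. The following Python sketch is an illustration only; the function names are ours and not part of the text. It enumerates all $2^n$ sign sequences, tallies them by the number of positive displacements, and forms the probabilities of eq. (3) as exact fractions so that the normalization (4) can be tested without rounding.

```python
from fractions import Fraction
from itertools import product
from math import comb

def count_by_enumeration(n):
    """Tally the 2**n sequences of +/- steps by their number of positive steps."""
    counts = [0] * (n + 1)
    for signs in product((+1, -1), repeat=n):
        counts[signs.count(+1)] += 1
    return counts

def newton_probabilities(n):
    """P_{n1} = C(n, n1) / 2**n for n1 = 0..n, as exact fractions (eq. 3)."""
    return [Fraction(comb(n, n1), 2**n) for n1 in range(n + 1)]
```

For n = 2 the enumeration returns the counts 1, 2, 1, in agreement with the four arrangements tabulated above, and the sum of `newton_probabilities(n)` is exactly unity for any n.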



3. CALCULATION OF AVERAGES 



We are now ready to use the results of Sec. 2 in the calculation 
of the average value of $L$. To obtain this we have merely to multiply 
the value of $L$ corresponding to a particular $n_1$, $n_2$ combination by 
the probability associated with this combination and to sum the 
products so obtained over all values of $n_1$, namely from 0 to $n$. If 
we denote the average of $L$ by $\bar{L}$, we have then 

$$\bar{L} = \sum_{n_1=0}^{n} L\,P_{n_1} = l\sum_{n_1=0}^{n}(n_1 - n_2)\,P_{n_1}. \tag{5}$$

The problem now is to evaluate (5). Expanding the sum gives 

$$\bar{L} = 2l\sum_{n_1=0}^{n} n_1 P_{n_1} - nl\sum_{n_1=0}^{n} P_{n_1}. \tag{6}$$

The computation of $\sum n_1 P_{n_1}$ is based on the fact that the quantities 
$\binom{n}{n_1}$ are actually the coefficients in the binomial expansion. From 
elementary algebra 

$$(1+x)^n = \sum_{n_1=0}^{n}\binom{n}{n_1}x^{n_1}. \tag{7}$$

If we differentiate both sides of this identity with respect to $x$ we get 

$$n(1+x)^{n-1} = \sum_{n_1=0}^{n} n_1\binom{n}{n_1}x^{n_1-1}. \tag{7'}$$

Since $x$ is arbitrary we can set it equal to unity and have 

$$n\,2^{n-1} = \sum_{n_1=0}^{n} n_1\binom{n}{n_1}, \tag{8}$$

which in connection with (5) and (6) at once leads to 

$$\bar{L} = 0. \tag{9}$$

This is a not unexpected result. It means simply that on the average 
the number of positive displacements $n_1$ equals the number of negative 
displacements $n_2$, which is, of course, inherent in the initial assumption 
that a positive displacement is just as likely as a negative one, so 
that indeed we must expect $\bar{n}_1 = n/2$. Mathematically speaking, 
this result is trivial. We have presented the analysis mainly because 
of its future utility. 

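There will still be some interest in an average which gives some 
information about the absolute value of $L$. Let us average $L^2$ instead 
of $L$. Thus from (1) 

$$\overline{L^2} = l^2\sum_{n_1=0}^{n}(n_1 - n_2)^2\,P_{n_1} = 4l^2\,\overline{n_1^2} - 4nl^2\,\bar{n}_1 + n^2l^2. \tag{10}$$

Differentiation of (7') with respect to $x$ yields 

$$n(n-1)(1+x)^{n-2} = \sum_{n_1=0}^{n} n_1(n_1-1)\binom{n}{n_1}x^{n_1-2},$$

whence, with $x = 1$ as before, 

$$n(n-1)\,2^{n-2} = \sum_{n_1=0}^{n} n_1(n_1-1)\binom{n}{n_1}.$$

Then 

$$\overline{n_1^2} = \frac{1}{2^n}\sum_{n_1=0}^{n} n_1^2\binom{n}{n_1} = \frac{n(n-1)}{4} + \frac{n}{2} = \frac{n(n+1)}{4}.$$

The other terms in (10) are already known. Hence substitution yields 

$$\overline{L^2} = nl^2. \tag{11}$$

The square root of $\overline{L^2}$, which we may call the root-mean-square value 
of $L$, will serve as a kind of average of $L$ without regard to its sign. 
Thus 

$$\sqrt{\overline{L^2}} = \sqrt{n}\,l. \tag{12}$$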
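The results (9) and (11) may be confirmed directly from Newton's formula (3) without recourse to the binomial identities. The Python sketch below is only an illustration (the function name and the use of exact fractions are our own choices); it sums $L P_{n_1}$ and $L^2 P_{n_1}$ term by term.

```python
from fractions import Fraction
from math import comb

def exact_averages(n, l=1):
    """Return (mean of L, mean of L**2) for n equally likely +/- steps of length l."""
    mean_L = Fraction(0)
    mean_L2 = Fraction(0)
    for n1 in range(n + 1):
        prob = Fraction(comb(n, n1), 2**n)   # eq. (3)
        L = l * (n1 - (n - n1))              # L = l (n1 - n2)
        mean_L += L * prob
        mean_L2 += L * L * prob
    return mean_L, mean_L2
```

For any n the first value returned is zero and the second equals $nl^2$, so that the root-mean-square displacement is $\sqrt{n}\,l$ as in (12).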

There is another interpretation of this result. Let us compute the 
mean square deviation of $L$ from its average value. This is called in 
statistical parlance the dispersion and usually denoted by $\sigma^2$. We 
then get the important general relation 

$$\sigma^2 = \overline{(L - \bar{L})^2} = \overline{L^2} - 2\bar{L}^2 + \bar{L}^2 = \overline{L^2} - \bar{L}^2. \tag{13}$$

In the special case under consideration $\bar{L} = 0$, and therefore $\sigma^2 = \overline{L^2}$. 
Here the mean-square value of $L$ is simply equal to the mean-square 
deviation of $L$ from its average. The square root of $\sigma^2$, or 
root-mean-square deviation from the average, is usually known as the standard 
deviation. In our problem the standard deviation is the 
root-mean-square of $L$ itself. 

The concepts of mean or average value, dispersion and standard 
deviation, are so important for statistical reasoning that we may well 
pause a little to discuss them further. The idea of average, indeed, 
has probably been discussed enough to be reasonably clear. If we 
were to make several repetitions of the experiment of observing the 
particle after an interval T, we should expect that some of the dis- 
placements from the origin would be positive, others negative, but the 
average close to zero. We have no reason to expect the average to 
be exactly zero, but if we found on repeated trials that the average 
failed to stay close to zero, we should suspect that we were over- 
looking some important feature of the problem and that our statistical 
reasoning was not applicable or at any rate not applicable in the 
simple form here assumed. In the standard deviation we have an 
additional test. The successive tries will yield values of $L$ different 
from $\bar{L}$, i.e., we must expect deviations or fluctuations from the mean. 
However, we have just shown we can calculate an average value for 
these fluctuations, viz., $\sqrt{n}\,l$ in our problem. We should expect that 
the observed value of this deviation for a great many trials would 
not differ markedly from $\sqrt{n}\,l$. If it did we might suspect that 
something was being neglected. 
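The repeated experiment contemplated here can be imitated numerically. The following Python sketch is an illustration under our own assumptions (a pseudo-random generator with a fixed seed stands in for the physical mechanism, and the function name is ours). It repeats the n-step displacement many times and returns the observed mean and root-mean-square value of L for comparison with 0 and $\sqrt{n}\,l$.

```python
import random
from math import sqrt

def simulate(n, trials, l=1.0, seed=0):
    """Repeat the n-step experiment `trials` times; return (mean L, rms L)."""
    rng = random.Random(seed)
    total = 0.0
    total_sq = 0.0
    for _ in range(trials):
        # Each of n random bits stands for one displacement; a 1 bit is a positive step.
        n1 = bin(rng.getrandbits(n)).count("1")
        L = l * (2 * n1 - n)                 # L = l (n1 - n2) with n2 = n - n1
        total += L
        total_sq += L * L
    return total / trials, sqrt(total_sq / trials)
```

With n = 100 and some twenty thousand trials the mean should lie close to zero and the root-mean-square value close to 10 l, though neither will agree exactly, which is just the behavior described in the text.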

Now in most applications of statistics in physics we do not and 
indeed cannot actually proceed to verify our fundamental assumptions 
quite as directly as all this. Rather we assume at the outset that 
the statistical method is the one appropriate for the problem and 
assign probabilities in what appears to be the most plausible manner. 
From these we calculate averages and, what is more important, 
relations among these averages and parameters which by the nature 
of the case have fixed values. These relations are the laws (in the 
sense of Chapter I) which should describe the phenomena in question. 
We then proceed to identify the average values with the actual results 
of experiment and hope to find that the statistical laws we have 
derived really do provide an accurate description. 

The fluctuations mentioned above are not to be dismissed as 
unimportant, for at times the theory may predict rather large ones 
and then experiment should certainly reveal them if the theory is 
applicable. 

4. LAPLACE'S FORMULA 

The formula (3) for the probability of the occurrence of $n_1$ positive 
displacements with $n_2$ negative ones in a total of $n = n_1 + n_2$ (usually 
called Newton's formula) is not convenient for mathematical manipulation 
since it contains the important number $n_1$ and consequently $L$ 
in the form of the factorial. There is advantage in expressing the 
probability explicitly as an analytic function of $L$. This has been 
done in an approximation formula associated with the name of 
Laplace. 

Let us express $\binom{n}{n_1}$ in terms of $L$. From $L = (n_1 - n_2)l$ and 
$n_1 + n_2 = n$ we have at once $n_1 = (n + L/l)/2$ and $n_2 = (n - L/l)/2$. 
For simplicity let $L/2l = x$. Then we can write 

$$\binom{n}{n_1} = \frac{n!}{(n/2+x)!\,(n/2-x)!} = \frac{n!\,[(n/2)!]^2}{(n/2+x)!\,(n/2-x)!\,[(n/2)!]^2} = \frac{n!}{[(n/2)!]^2}\cdot\frac{\frac{n}{2}\left(\frac{n}{2}-1\right)\cdots\left(\frac{n}{2}-x+1\right)}{\left(\frac{n}{2}+x\right)\left(\frac{n}{2}+x-1\right)\cdots\left(\frac{n}{2}+1\right)}. \tag{14}$$



The form (14) assumes tacitly that $x$ is positive. The form will change 
if $x$ is negative but the reader can show that the same ultimate result 
is obtained. It is assumed, of course, that $n$ is a very large number 
compared with $x$. Multiply both numerator and denominator by 
$(2/n)^x$ and get 

$$\binom{n}{n_1} = \frac{n!}{[(n/2)!]^2}\cdot\frac{1\cdot(1 - 2/n)(1 - 4/n)\cdots(1 - (2x-2)/n)}{(1 + 2/n)(1 + 4/n)\cdots(1 + 2x/n)}. \tag{15}$$

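Under the assumption just made (15) can be written to a very close 
approximation (each factor $1 \pm a$ with $a \ll 1$ being replaced by $e^{\pm a}$) 

$$\binom{n}{n_1} = \frac{n!}{[(n/2)!]^2}\cdot\frac{e^{-[2+4+\cdots+(2x-2)]/n}}{e^{[2+4+\cdots+2x]/n}}. \tag{16}$$

Making use of the arithmetical progression formula $2 + 4 + 6 + \cdots + 2x = x^2 + x$, 
etc., we can finally write $\binom{n}{n_1}$ in the form 

$$\binom{n}{n_1} = \frac{n!}{[(n/2)!]^2}\,e^{-2x^2/n} = \frac{n!}{[(n/2)!]^2}\,e^{-L^2/2nl^2}, \tag{17}$$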

in which $L$ is now released from the bondage of the factorial and 
appears in explicit form in the exponential. Equation (17) is commonly 
termed Laplace's formula. The function $e^{-ax^2}$ has long been known as 
the Gauss probability or error function. It has the well-known form 
indicated in the accompanying Fig. 2-1, where we have plotted $\binom{n}{n_1}$ 
as a function of $x$. Strictly speaking Laplace's formula represents 
$\binom{n}{n_1}$ accurately only for $x \ll n/2$ and its value for larger values of $x$ 
might appear highly questionable. Fortunately we are usually interested 
in such large values of $n$ that for large $x$ the value of the ordinate 
is very small indeed compared with its value at $x = 0$, and the 
percentage error involved in the use of the analytical formula in place of 
Newton's factorial formula becomes very small. This 
enables us to use Laplace's formula in the evaluation 
of averages, for which its form renders it particularly 
suitable. The error function is an even function, i.e., the 
value for given $x$ is the same as for $-x$, expressing 
the fact that a given positive value of $L$ is equally probable 
with $-L$. Moreover the maximum value occurs for $x$ or 
$L = 0$, indicating maximum probability for the average value. 

[Fig. 2-1. $\binom{n}{n_1}$ plotted as a function of $x$.] 
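The range of validity of Laplace's formula can be examined numerically. The Python sketch below is an illustration only (the function name is ours, and n is assumed even so that n/2 is an integer). It forms the ratio of the right-hand side of (17), with the factorial coefficient kept exact, to the exact binomial coefficient.

```python
from math import comb, exp

def laplace_ratio(n, x):
    """Ratio of Laplace's approximation (17) to the exact C(n, n/2 + x), n even."""
    central = comb(n, n // 2)                 # n! / [(n/2)!]^2
    approx = central * exp(-2.0 * x * x / n)  # eq. (17) with L/2l = x
    return approx / comb(n, n // 2 + x)
```

For n = 1000 the ratio stays within a fraction of a per cent of unity for x of the order of 10, but departs widely from unity when x is an appreciable fraction of n/2, where both quantities are in any case minute compared with their value at x = 0.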




5. EVALUATION OF THE COEFFICIENT IN LAPLACE'S FORMULA. 
STIRLING'S FORMULA 

Practical use of Laplace's formula (17) can be made only if the 
factorial coefficient $n!/[(n/2)!]^2$ is evaluated in terms of a simple 
function of $n$. This involves essentially the development of a formula 
for $n!$. Such a one is Stirling's formula. There are several ways of 
deriving this presented in books on advanced calculus. We shall give 
a brief derivation here as an excuse for introducing the gamma function, 
which will be of considerable use to us later. Thus by definition for 
any real positive $n$ and $z$ (we are here contented with real variables) 

$$\Gamma(n) = \int_0^\infty z^{n-1}e^{-z}\,dz, \tag{18}$$

whence a partial integration gives $\Gamma(n) = (1/n)\int_0^\infty z^n e^{-z}\,dz$, so that 
$n\Gamma(n) = \Gamma(n + 1)$. But since $\Gamma(1) = 1$ by direct integration, we have 
for integral values of $n$ 

$$\Gamma(n+1) = n!, \tag{19}$$

so that 

$$n! = \int_0^\infty z^n e^{-z}\,dz. \tag{19'}$$

If we make the substitution $z = n + \sqrt{n}\,u$, taking advantage of the 
fact that for $z = n$ or $u = 0$ the integrand in (19') attains its maximum 
value, 

$$n! = \sqrt{n}\,e^{-n}\int_{-\sqrt{n}}^{\infty} e^{\,n\log(n+\sqrt{n}\,u) - \sqrt{n}\,u}\,du. \tag{20}$$

We can now write $\log(n + \sqrt{n}\,u) = \log n + \log(1 + u/\sqrt{n})$ and 
expand the second term in a Maclaurin series, getting 
$\log(1 + u/\sqrt{n}) = u/\sqrt{n} - u^2/2n$ to terms of the second order, which are sufficient 
for our purpose since $n$ is assumed to be large. Substitution into (20) 
then yields 

$$n! = \sqrt{n}\,e^{-n}n^n\int_{-\sqrt{n}}^{\infty} e^{-u^2/2}\,du. \tag{21}$$

Now the integral 

$$\sqrt{\frac{2}{\pi}}\int_x^\infty e^{-u^2/2}\,du = \Phi(x) \tag{22}$$

is known as the error function and plays an important role in the 
applications to follow. The particular case when $x = 0$ is important. 
We have² 

$$\sqrt{\frac{2}{\pi}}\int_0^\infty e^{-u^2/2}\,du = 1. \tag{23}$$

Consequently (21) becomes 

$$n! = \sqrt{n}\,e^{-n}n^n\left[\sqrt{2\pi} - \sqrt{\pi/2}\;\Phi(\sqrt{n})\right]. \tag{24}$$

But as $x$ increases indefinitely $\Phi(x) \to 0$. Hence, since $n$ is assumed to 
be large we are justified in neglecting $\sqrt{\pi/2}\;\Phi(\sqrt{n})$ compared with 
$\sqrt{2\pi}$. This yields the approximation 

$$n! = \sqrt{2\pi n}\left(\frac{n}{e}\right)^n. \tag{25}$$

The demonstration is not intended to be taken as mathematically 
rigorous. Stirling's formula is, however, a very useful and surprisingly 
accurate one, with a very small percentage error even for n as small 
as 10. The absolute error grows with n, but this is of little conse- 
quence in most physical applications. 
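The accuracy claimed for Stirling's formula is readily tested. The Python sketch below is an illustration only (the function names are ours). It evaluates (25) and the percentage by which it falls short of the exact factorial.

```python
from math import e, factorial, pi, sqrt

def stirling(n):
    """Stirling's approximation (25): n! ~ sqrt(2 pi n) (n/e)**n."""
    return sqrt(2.0 * pi * n) * (n / e) ** n

def percent_error(n):
    """Percentage by which (25) falls short of the exact n!."""
    exact = factorial(n)
    return 100.0 * (exact - stirling(n)) / exact
```

For n = 10 the percentage error is somewhat less than 1 per cent and it diminishes roughly as 1/12n, while the absolute difference between n! and (25) increases with n.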

We are now ready to apply the expression (25) to Laplace's 
formula (17). The result is the approximate formula 

$$\binom{n}{n_1} = 2^n\sqrt{\frac{2}{\pi n}}\;e^{-L^2/2nl^2}. \tag{26}$$

This at once shows that $\binom{n}{n_1}$ is a maximum for $n_1 = n_2 = n/2$ or $L = 0$ 
and decreases more or less rapidly as $L$ increases. 

² Cf., for example, F. S. Woods, "Advanced Calculus," p. 153, Ginn and 
Company, 1934, or any similar book on the subject. 

It is now desirable to associate a probability not with a single 
definite value of $n_1$ or $L$ but a finite (though small) range of values 
of $L$. We shall agree to define the probability associated with the 
interval $L$ to $L + dL$ as 

$$P_L\,d\!\left(\frac{L}{2l}\right) = \sqrt{\frac{2}{\pi n}}\;e^{-L^2/2nl^2}\,d\!\left(\frac{L}{2l}\right). \tag{27}$$



Since the probability is a pure number with no physical dimensions 
we use $d(L/2l)$ instead of $dL$ to denote the interval. Moreover 
$L/2l = x$ measures the deviation of $n_1$ and $n_2$ from their mean values. 
In the expression (27) $P_L$ appears as the probability associated with 
unit interval of $L/2l$ in the neighborhood of $L/2l$. The form of (27) 
is governed by one other criterion: if we integrate both sides with 
respect to the independent variable $L/2l$ and allow $L$ to take on the 
limiting values $-nl$ and $+nl$ we expect to obtain the value unity since 
it is certain that $L$ will be somewhere in the interval from $-nl$ to $+nl$. 
We thus demand that, as $n \to \infty$, 

$$\sqrt{\frac{2}{\pi n}}\int_{-n/2}^{+n/2} e^{-2x^2/n}\,dx = 1, \tag{28}$$

where the limits have been changed to those of $x$. From (22) and 
(23) there results 

$$\sqrt{\frac{2}{\pi n}}\int_0^\infty e^{-2x^2/n}\,dx = \frac{1}{2}, \tag{29}$$

and since from the even character of the integrand, the integral in (28) 
is double that in (29), we conclude that as $n$ becomes very great, (28) 
is true if $P_L$ is given by (27). 

Equation (27) is usually known as the normal or Gaussian distribution 
law. It has already been emphasized that it is a good representation 
of the Newtonian algebraic distribution law (3) only in the 
range where $L/l \ll n$. This does not limit its usefulness as much as 
might be supposed, since in the region where $L$ is large, the large 
exponent in the exponential function makes $P_L$ vanishingly small in 
any case. This fact renders the normal distribution law very useful 
in the calculation of averages. 

6. USE OF THE NORMAL DISTRIBUTION LAW IN THE CALCULATION 
OF AVERAGES 

The advantage of the normal distribution (27) is that it corre- 
sponds to a continuous variation in L and therefore reduces the 
averaging process to a mere matter of integration. An objection may 
be made that the actual distribution is not continuous, for L makes 
only discrete steps. Nevertheless, if n is very large the difference 
from step to step becomes a very small fraction of the maximum L, 
and the variable $L/2l = x$ can be considered continuous to a sufficiently 
good approximation. Indeed it is not to be expected that the normal 
distribution will apply to all statistical problems in physics. We shall 
later meet with some in which it is an extremely poor approximation. 
Nevertheless here, where the probabilities of positive and negative 
displacements are equal, if $x \ll n$, the value of the distribution 
cannot be doubted. We shall see indeed that it gives the same values 
of $\bar{L}$ and $\sigma^2$ as were obtained above with the algebraic formula. 

Thus for the average $\bar{x}$ ($= \bar{L}/2l$), the usual way of defining an 
average for a continuous distribution gives 

$$\bar{x} = \int_{-\infty}^{\infty} x\,P_L(x)\,dx = \frac{2}{\sqrt{2\pi n}}\int_{-\infty}^{\infty} x\,e^{-2x^2/n}\,dx = 0, \tag{30}$$

where the result of the integration follows at once from the fact that 
$x e^{-2x^2/n}$ is an odd function of $x$. The limits of the integration are 
from $-\infty$ to $+\infty$, whereas strictly speaking $x$ is limited to the range 
$-n/2$ to $+n/2$. But if $n$ is very large the introduction of the infinite 
limits involves no greater error than is already present in the choice 
of $P_L(x)$. We shall hereafter consistently use the infinite limits for the 
sake of mathematical simplicity. 

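The average value of the magnitude of $x$, i.e., $|x|$, is 

$$\overline{|x|} = \frac{4}{\sqrt{2\pi n}}\int_0^\infty x\,e^{-2x^2/n}\,dx = \frac{1}{\sqrt{2\pi}}\sqrt{n} = 0.399\sqrt{n}. \tag{31}$$

This gives 

$$\overline{|L|} = 0.798\,\sqrt{n}\,l. \tag{31'}$$

In similar fashion 

$$\overline{x^2} = \frac{4}{\sqrt{2\pi n}}\int_0^\infty x^2 e^{-2x^2/n}\,dx = \frac{n}{4}. \tag{32}$$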

This result is in agreement with the one obtained by purely algebraic 
means in eq. (11). For $\overline{L^2} = 4l^2\,\overline{x^2} = nl^2$, leading to 
$\sqrt{\overline{L^2}} = l\sqrt{n}$ or 

$$\sqrt{\overline{x^2}} = 0.500\sqrt{n}, \tag{33}$$

which is also a measure of the standard deviation of the distribution as 
a function of $x$. It is of interest to note that whereas the values of 
$\overline{|x|}$ and $\sqrt{\overline{x^2}}$ actually differ in value by about 20 per cent, 
nevertheless they are of the same order of magnitude. If they had proved to be 
of quite different orders of magnitude, we might well have had our 
doubts about the value of the distribution law. It is natural to inquire 
about the averages of higher order, e.g., $\sqrt[k]{\overline{x^k}}$. We see at once that the 
odd power averages all vanish for the same reason that $\bar{x} = 0$. The 
reader may show that for $k$ even 

$$\sqrt[k]{\overline{x^k}} = [1\cdot 3\cdot 5\cdots(k-1)]^{1/k}\,\frac{\sqrt{n}}{2}. \tag{34}$$

The significance of this result is that all the averages are multiples 
of $\sqrt{n}$ with coefficients slowly increasing with $k$. The 
root-mean-square average is, of course, the most useful one in statistical problems. 
There is, to be sure, another type of average somewhat different 
from those just considered. This is the arbitrarily termed "probable" 
average. It is the value of $x$, usually denoted by $x_P$, such that there 
are just as many cases in which $x$ exceeds this in absolute value as 
there are cases in which $x$ is less than this in absolute value. This is 
equivalent to saying that the ordinate for $|x_P|$ divides the area under 
the probability curve (Fig. 2-1) to the right of the origin into two 
equal parts. This area has however the numerical value 1/2. Consequently 

$$\frac{2}{\sqrt{2\pi n}}\int_0^{x_P} e^{-2x^2/n}\,dx = \frac{1}{4}.$$

By consulting the table of values of the probability function $\Phi(x)$, we 
find that this requirement can be uniquely satisfied by 

$$x_P = 0.338\sqrt{n}, \tag{35}$$

approximately. We note the agreement in order of magnitude with 
$\overline{|x|}$ and $\sqrt{\overline{x^2}}$. Placed in order, $x_P < \overline{|x|} < \sqrt{\overline{x^2}}$. 
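The coefficients in (31) and (33) were obtained from the continuous normal law. They may be compared with the values given by Newton's formula (3) itself. The Python sketch below is an illustration only (the function name is ours, and n is assumed even). It sums over the discrete distribution and expresses the two averages in units of $\sqrt{n}$.

```python
from math import comb, sqrt

def deviation_averages(n):
    """For even n return (mean |x|, rms x) in units of sqrt(n), with x = n1 - n/2."""
    total = 2**n
    mean_abs = 0.0
    mean_sq = 0.0
    for n1 in range(n + 1):
        prob = comb(n, n1) / total     # Newton's formula (3)
        x = n1 - n // 2                # deviation of n1 from its mean n/2
        mean_abs += abs(x) * prob
        mean_sq += x * x * prob
    return mean_abs / sqrt(n), sqrt(mean_sq) / sqrt(n)
```

For n = 1000 the first coefficient comes out very near 0.399 and the second is 0.500 to within rounding, so that the replacement of the sums by integrals with infinite limits introduces no sensible error here.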

An important measure of the significance of these average values is 
found in the probability that the value of the deviation $x$ shall not 
exceed them. The probability that $x$ shall not exceed the value $x_0$ 
in absolute magnitude will be written 

$$P(x_0) = \frac{2}{\sqrt{2\pi n}}\int_{-x_0}^{+x_0} e^{-2x^2/n}\,dx. \tag{36}$$

For $x_0 = x_P$, $P(x_P) = 1/2$ by definition. More generally, from the 
definition of the error function $\Phi(x)$ in eq. (22), 

$$P(x_0) = 1 - \Phi\!\left(\frac{2x_0}{\sqrt{n}}\right). \tag{37}$$

Thus $P(\overline{|x|}) = 1 - \Phi(0.798) = 0.576$, while $P(\sqrt{\overline{x^2}}) = 1 - \Phi(1) = 
0.683$. We can express this as follows: On the average there are 576 
chances in 1,000 that a deviation or fluctuation will turn up which is 
equal to or less than, i.e., not exceeding $\overline{|x|}$, while there are 683 chances 
in 1,000 that the deviation $x$ shall not exceed $\sqrt{\overline{x^2}}$. There are 500 
chances in 1,000 that $x$ shall not be greater than $x_P$. The probability 
rises with the magnitude of the average. 
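The probabilities just quoted can be reproduced without tables. With the definition (22), $\Phi(x)$ is the same as the complementary error function of $x/\sqrt{2}$, which Python's standard library supplies as `math.erfc`. The sketch below is an illustration only (the function names are ours).

```python
from math import erfc, sqrt

def phi(x):
    """The text's error function (22): sqrt(2/pi) * integral from x to infinity of exp(-u^2/2) du."""
    return erfc(x / sqrt(2.0))

def prob_not_exceeding(c):
    """P(|x| <= c * sqrt(n)) from eq. (37), i.e. 1 - Phi(2c)."""
    return 1.0 - phi(2.0 * c)
```

Thus `prob_not_exceeding(0.399)` and `prob_not_exceeding(0.500)` give the probabilities attached to $\overline{|x|}$ and $\sqrt{\overline{x^2}}$, and `prob_not_exceeding(0.338)` is close to one half.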

7. UNEQUAL A PRIORI PROBABILITIES 

The physical problem so far treated in this chapter is a rather 
idealized one, though a somewhat more general case of it occurs 
in the so-called Brownian movement of small colloidal particles 
suspended in a liquid or a gas. This we shall discuss later in some 
detail. One of the fundamental idealizations in the problem is the 
assumption that a positive displacement of length $l$ is just as likely as 
a negative displacement. There are few problems in physical statistics 
in which a simple assumption of this kind can be made with any success. 
Therefore, although the mathematical developments we have carried 
out are fundamental, they are not all immediately applicable to actual 
physical situations. It should indeed be mentioned that interesting 
illustrations of the formulas of the preceding sections may be found 
in the simple physical experiment of coin tossing. This can be made 
formally analogous to the physical problem of Sec. 1 merely by 
associating heads with a positive displacement of the particle and 
tails with a negative displacement. The problems at the end of this 
chapter show how well our theoretical expressions agree with experience 
of this kind. 

Now, however, a generalization is in order. We shall assume that 
the two types of displacement are not equally likely but that the 
a priori probability of a positive displacement is $p/q$ times that of a 
negative displacement, where $p$ and $q$ are positive proper fractions 
with the property that $p + q = 1$. The term a priori demands 
careful consideration. Two possibilities are at hand: (a) we may have 
observed by direct experiment that the fraction $p/q$ represents the 
relative frequency. In most physical problems this course is not open 
and hence we must use the other possibility: (b) assume the ratio 
from the best information we have about the system in question. 

The problem of the preceding sections corresponds to $p = q = 1/2$. 
We now desire to derive the more general formulas. Again consider $n$ 
displacements divided into two groups, i.e., $n_1$ positive and $n_2$ 
negative, with $n_1 + n_2 = n$. If the probability of a positive displacement 
is $p$, the probability of $n_1$ positive displacements independent of any 
negative displacements will be $p^{n_1}$. The negative displacements 
behave similarly. Hence the probability of a combination of $n_1$ 
positive displacements with $n_2$ negative displacements, if there were only 
one way of realizing it, would clearly be $p^{n_1}q^{n_2}$. However, we have 
already seen that there are $\binom{n}{n_1}$ ways of realizing this distribution. 
Consequently the actual probability corresponding to this combination 
is 

$$P_{n_1} = \binom{n}{n_1}\,p^{n_1}q^{n_2}. \tag{38}$$



The calculation of the average displacement $\bar{L}$ in this case proceeds 
thus: 

$$\bar{L} = \sum_{n_1=0}^{n} L\,P_{n_1} = 2l\sum_{n_1=0}^{n} n_1\binom{n}{n_1}p^{n_1}q^{n_2} - nl\sum_{n_1=0}^{n}\binom{n}{n_1}p^{n_1}q^{n_2}. \tag{39}$$

Since $\sum P_{n_1} = 1$ the second term in (39) reduces to $nl$. Now write 

$$(q + px)^n = \sum_{n_1=0}^{n}\binom{n}{n_1}p^{n_1}q^{n_2}x^{n_1}. \tag{40}$$

By differentiating both sides with respect to $x$, we get (after letting the 
parameter $x = 1$) 

$$np = \sum_{n_1=0}^{n} n_1\binom{n}{n_1}p^{n_1}q^{n_2}. \tag{41}$$

Therefore substitution into (39) gives 

$$\bar{L} = n(p - q)\,l. \tag{42}$$

We could indeed have computed $\bar{n}_1$ and $\bar{n}_2$ directly and have found 
by the above method $\bar{n}_1 = np$ and $\bar{n}_2 = nq$, so that 
$\bar{L} = (\bar{n}_1 - \bar{n}_2)l = n(p - q)l$. We see that the general result (42) agrees 
with the special formula (9) when $p = q = 1/2$, giving $\bar{L} = 0$. However, for $p \neq q$, 
$\bar{L} \neq 0$. In particular, for $p \gg q$, we get approximately $\bar{L} = npl$ or 
practically $nl$. 

The calculation of the mean-square displacement $\overline{L^2}$ follows in 
similar fashion with the result 

$$\overline{L^2} = nl^2\,[4pq + n(p - q)^2]. \tag{43}$$

In the special case where $p = q = 1/2$, this reduces to the previously 
obtained value (11). More interesting perhaps is the dispersion 
$\sigma^2 = \overline{L^2} - \bar{L}^2$. This becomes 

$$\sigma^2 = 4npql^2. \tag{44}$$
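The results (42) and (44) can be confirmed directly from the probabilities (38). The Python sketch below is an illustration only (the function name and the use of exact fractions are our own choices). It sums over all values of $n_1$ and returns the mean of L together with the dispersion.

```python
from fractions import Fraction
from math import comb

def biased_walk_moments(n, p, l=1):
    """Exact (mean of L, dispersion of L) when each step is positive with probability p."""
    p = Fraction(p)
    q = 1 - p
    mean_L = Fraction(0)
    mean_L2 = Fraction(0)
    for n1 in range(n + 1):
        n2 = n - n1
        prob = comb(n, n1) * p**n1 * q**n2   # eq. (38)
        L = l * (n1 - n2)
        mean_L += L * prob
        mean_L2 += L * L * prob
    return mean_L, mean_L2 - mean_L**2       # dispersion as mean square minus squared mean
```

With n = 12 and p = 3/10, for example, the function returns -24/5 and 252/25, which are precisely $n(p - q)l$ and $4npql^2$ for l = 1.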

The corresponding dispersion in $x = L/2l$ is clearly 

$$\sigma_x^2 = npq, \tag{45}$$

with the standard deviation 

$$\sigma_x = \sqrt{npq}. \tag{46}$$

8. GENERALIZED LAPLACE'S FORMULA 

We now wish to find the more general Gaussian distribution 
formula for the case where $p \neq q$. For this we shall express $P$ in 
terms of the deviation of $n_1$ from its average $np$. This deviation will 
be denoted by $u$, so that $n_1 = np + u$, and $n_2 = nq - u$. The 
problem is then to obtain an expression for (38), viz., 

$$P = \frac{n!}{(np+u)!\,(nq-u)!}\,p^{np+u}\,q^{nq-u}, \tag{47}$$

in which the quantity $u$ is freed from the bondage of the factorial. 
For the sake of variety we proceed somewhat differently than in Sec. 4. 
Taking the logarithm of $P$, we obtain 

$$\log P = \log n! - \log(np+u)! - \log(nq-u)! + (np+u)\log p + (nq-u)\log q. \tag{48}$$

Next we suppose that the factorials are all large enough so that we 
can apply Stirling's formula in the form (cf. eq. 25) 

$$\log n! = n\log n - n + \tfrac{1}{2}\log 2\pi n. \tag{49}$$

We also take advantage of the expansion for $\log(1 + x)$ where 
$x \ll 1$ (satisfied, e.g., by $u/nq$ and $u/np$), viz., 

$$\log(1+x) = x - \tfrac{1}{2}x^2 + \cdots. \tag{50}$$

The expansion of (48) using (49) and (50) though a little lengthy is 
perfectly straightforward and need not be set down here in full. After 
collection of terms we get 

$$\log P = -\frac{u^2}{2npq} - \frac{1}{2}\log 2\pi npq + \frac{u(p-q)}{2npq} + \frac{1}{4}\frac{u^2}{n^2p^2} + \frac{1}{4}\frac{u^2}{n^2q^2},$$

neglecting powers of $u$ higher than the second. Examination discloses 
that the first term on the right is of greater order of magnitude than 
the third, fourth, and fifth. Consequently if we wish to keep only 
the highest order term in $u$, the result will be 

$$P = \frac{1}{\sqrt{2\pi npq}}\;e^{-u^2/2npq}. \tag{51}$$

This is the generalized form of Laplace's formula. It represents the 
normal or Gaussian distribution for the deviation of $n_1$ and $n_2$ from 
their average values, $np$ and $nq$, respectively. For $p = q = 1/2$, (51) 
naturally reduces to (27). It must be emphasized that (51) holds 
only for $u$ small compared with $np$ and $nq$. 
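The quality of the generalized Laplace formula can be judged by setting it beside the exact probability (38). The Python sketch below is an illustration only (the function names are ours, and ordinary floating-point arithmetic is assumed adequate for n of the order of a thousand).

```python
from math import comb, exp, pi, sqrt

def exact_probability(n, n1, p):
    """Binomial probability (38)."""
    return comb(n, n1) * p**n1 * (1.0 - p) ** (n - n1)

def gaussian_probability(n, n1, p):
    """Generalized Laplace formula (51) with u = n1 - n p."""
    q = 1.0 - p
    u = n1 - n * p
    return exp(-u * u / (2.0 * n * p * q)) / sqrt(2.0 * pi * n * p * q)
```

For n = 1000 and p = 0.3 the two expressions agree to a small fraction of a per cent at $n_1 = 300$ (u = 0) and to about one per cent at $n_1 = 310$, the discrepancy arising chiefly from the neglected term in $u(p - q)$.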

9. VOLUME DISTRIBUTION OF GAS MOLECULES. FLUCTUATIONS 

It will now be of interest to apply the analysis developed in the 
preceding sections to a somewhat more practical problem than that 
discussed in Sec. 1. We shall consider an ideal gas, which we shall 
assume to be composed of molecules in accordance with the elementary 
kinetic theory. In a volume V of gas, in a closed space containing N 
identical and indistinguishable molecules, the molecules will be moving 
about with varying velocities colliding frequently with each other 
and the walls of the containing vessel. If some one were to ask how 
many molecules there will be in a subvolume of $V$, say $V_1$, we should 
have to admit that in all probability the precise number will change 
from instant to instant and all we can hope to do is to assign an 
average value to the number $N_1$. How can this average be calculated? 
Let us proceed in the simplest possible fashion by assigning the 
value $V_1/V$ to the probability that a given molecule shall be in the 
subvolume $V_1$. This leads to certainty that the molecule shall lie in 
$V$, i.e., unit probability, while the probability decreases as the size of 
$V_1$ decreases relatively to $V$. Obviously there is no proof of this 
choice; it is only an assumption, but certainly a reasonable one. 
Similarly the probability that the molecule shall not be in $V_1$ but in 
$V_2 = V - V_1$, is $V_2/V$. If we denote the probability $V_1/V$ by $p$ 
and the probability $V_2/V$ by $q$ (with $p + q = 1$) we should be able to 
apply the considerations of Secs. 7 and 8. In particular the probability 
that $N_1$ molecules shall be in $V_1$ and $N_2$ molecules in $V_2$ (with 
$N_1 + N_2 = N$) will be simply 

$$P_{N_1} = \frac{N!}{N_1!\,N_2!}\,p^{N_1}q^{N_2}. \tag{52}$$

Applying Sec. 7, the average values of $N_1$ and $N_2$ become 

$$\bar{N}_1 = Np, \qquad \bar{N}_2 = Nq. \tag{53}$$

Clearly there will be fluctuations from these average values. If 
neither $Np$ nor $Nq$ is too small, i.e., if the subvolumes are not too 
small nor the gas too rare, we can immediately compute from Laplace's 
formula the probability of a fluctuation $u = N_1 - Np$. It is indeed 
given by (51) with $N$ in place of $n$. This enables us to compute, for 
example, the probability that the fluctuation will equal or exceed a 
certain amount. Let us take an actual case, namely, $V = 1$ cm³, 
$V_1 = 10^{-3}$ cm³, with $N = 2.7 \times 10^{19}$, the number of molecules per 
cm³ in an enclosed gas under standard conditions. Then $p = 10^{-3}$ and 
$\bar{N}_1 = 2.7 \times 10^{16}$. Let us compute the probability that $u/\bar{N}_1$ shall 
equal or exceed $10^{-3}$ in absolute value, or 0.1 per cent. This will be 
given to close approximation by 

$$P = \frac{2}{\sqrt{2\pi Npq}}\int_{10^{-3}\bar{N}_1}^{\infty} e^{-u^2/2Npq}\,du$$

with $N = 2.7 \times 10^{19}$, $p = 10^{-3}$, $q = 1 - 10^{-3}$, $\bar{N}_1 = 2.7 \times 10^{16}$. 
To evaluate we change variables so that $u^2/2Npq = v^2/2$ or 
$u = v\sqrt{Npq}$, whence 

$$P = \sqrt{\frac{2}{\pi}}\int_{\sqrt{2.7\times 10^{10}}}^{\infty} e^{-v^2/2}\,dv = \Phi\!\left(\sqrt{2.7\times 10^{10}}\right).$$

For $x \gg 1$, we have $\Phi(x) \approx \sqrt{2/\pi}\;\dfrac{e^{-x^2/2}}{x}\left(1 - \dfrac{1}{x^2}\right)$. 
Consequently $P$ here becomes approximately 

$$P \approx \sqrt{\frac{2}{\pi}}\;\frac{e^{-1.35\times 10^{10}}}{1.64\times 10^{5}}.$$

This indicates that a fluctuation equal to or greater than the one 
indicated is very rare indeed. 
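The probability just found is far too small to be represented as an ordinary floating-point number, but its logarithm is easily obtained. The Python sketch below is an illustration only (the function name is ours). It uses the leading term of the large-x form of $\Phi(x)$ quoted above and returns $\log_{10} P$.

```python
from math import log, pi, sqrt

def log10_tail_probability(N, p, rel):
    """log10 of P(|u| >= rel * N p), using Phi(x) ~ sqrt(2/pi) exp(-x^2/2) / x for large x."""
    q = 1.0 - p
    x = rel * N * p / sqrt(N * p * q)       # lower limit of the integral in units of sqrt(Npq)
    # Work with logarithms throughout, since exp(-x*x/2) itself would underflow to zero.
    ln_P = 0.5 * log(2.0 / pi) - 0.5 * x * x - log(x)
    return ln_P / log(10.0)
```

For $N = 2.7 \times 10^{19}$, $p = 10^{-3}$ and a relative fluctuation of $10^{-3}$ the result is of the order of $-6 \times 10^{9}$, i.e., P is a decimal fraction with some six thousand million zeros after the point.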

Another way of viewing the same problem is to calculate the standard 
deviation of $N_1$. From eq. (46) this is given by 

$$\sigma = \sqrt{Npq}. \tag{54}$$

In the present example, $\sigma = 1.64 \times 10^{8}$, approximately. This is large 
in actual value but not in comparison with $\bar{N}_1$. Of greater significance 
than $\sigma$ is the fractional or relative standard deviation, $\sigma/\bar{N}_1$. In the 
present case this is only $1.64 \times 10^{8}/2.7 \times 10^{16} \approx 10^{-8}$ or $10^{-6}$ 
per cent. The chance of finding such a small relative standard deviation 
by detecting density fluctuations experimentally would appear to 
be negligible. It might be supposed that if we could take a small 
enough subvolume the chance of detection would be greater. Thus if 
$V_1 = 10^{-8}$ cm³, $p = 10^{-8}$, $\bar{N}_1 = 2.7 \times 10^{11}$, $\sigma \approx 5 \times 10^{5}$ and 
$\sigma/\bar{N}_1 \approx 2 \times 10^{-6}$, which is some two hundred times larger than the former 
value though still small. As one looks around for possible means 
of detecting such a density fluctuation in such a small volume, one 
inevitably thinks of the effect of density change on the index of 
refraction of light. 
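The arithmetic of the two examples can be repeated for any subvolume. The Python sketch below is an illustration only (the function and constant names are ours; the value $2.7 \times 10^{19}$ molecules per cm³ is the one used in the text).

```python
from math import sqrt

MOLECULES_PER_CM3 = 2.7e19  # standard conditions, the figure adopted in the text

def relative_fluctuation(V1, V=1.0, N=MOLECULES_PER_CM3):
    """Return (mean N1, sigma, sigma / mean N1) for a subvolume V1 of V (both in cm^3).

    N is the total number of molecules in V.
    """
    p = V1 / V
    q = 1.0 - p
    mean_N1 = N * p                         # eq. (53)
    sigma = sqrt(N * p * q)                 # eq. (54)
    return mean_N1, sigma, sigma / mean_N1
```

Calling `relative_fluctuation(1e-3)` and `relative_fluctuation(1e-8)` reproduces the orders of magnitude quoted above, the relative standard deviation varying nearly as the inverse square root of the subvolume.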

The dependence of the index of refraction of a gas on the density 
may be written rather accurately in the form 

$$\mu = 1 + \alpha\rho, \tag{55}$$

in which $\rho$ is the density and $\alpha$ is a constant over a considerable range 
of variation. From this we immediately conclude that 

$$\frac{\Delta\mu}{\mu} = \frac{\alpha\,\Delta\rho}{1 + \alpha\rho} \approx \alpha\,\Delta\rho, \tag{56}$$

since $\alpha\rho \ll 1$ for gases. Thus for air the average $\mu = 1.00029$ 
approximately for standard conditions, i.e., $\alpha\rho_0 = 0.00029$. Hence 

$$\frac{\Delta\mu}{\mu} = 0.00029\,\frac{\Delta\rho}{\rho_0}. \tag{57}$$

Suppose now we consider as our fundamental volume a cubic wavelength 
of the yellow light from sodium vapor, viz., that with an 
approximate wavelength of $6 \times 10^{-5}$ cm. Then $V_1 \approx 2 \times 10^{-13}$ cm³, 
$\bar{N}_1 \approx 5 \times 10^{6}$ and $\sigma \approx 2 \times 10^{3}$ with $\sigma/\bar{N}_1 \approx 5 \times 10^{-4} = \Delta\rho/\rho_0$, from 
the definition of density. Consequently in this case 

$$\frac{\Delta\mu}{\mu} \approx 1.5 \times 10^{-7}.$$
M 

In other words there should be a fluctuation of about one unit in the 
seventh decimal place of the average index of refraction of air for 
visible light. This is scarcely large enough for experimental detection. 
For light of shorter wavelength the fluctuation is somewhat increased 
but is still very small. It has been shown by Smoluchowski³ and 
others that for real gases near the critical temperature the above 
simple fluctuation theory (based on an ideal gas) is inadequate. 
Smoluchowski showed that for real gases the expression for $\sigma/\bar{N}_1$ 
given above is not correct. Actually analysis shows that the more 
nearly correct result is 

$$\left(\frac{\sigma}{\bar{N}_1}\right)^2 = -\frac{RT}{\bar{N}_1 V^2\,(\partial p/\partial V)}.$$

For an ideal gas $\partial p/\partial V = -RT/V^2$ and $(\sigma/\bar{N}_1)^2$ reduces to $1/\bar{N}_1$, or $\sigma^2$ 
becomes $\bar{N}_1$, which is the approximate result given in our work by (54). 

On the other hand for a real gas near the critical point where 
$\partial p/\partial V$ is very small the value of $(\sigma/\bar{N}_1)^2$ can become very much 
larger than (54) would predict. Actually such gases at the critical point 
when illuminated show an opalescence which has been attributed to the 
density fluctuation. 

The blue color of the sky has also been explained by density 
fluctuations like those considered here. We shall return to the prob- 
lem in a somewhat different form when we encounter the Brownian 
motion. 

10. THE SHOT EFFECT AS A FLUCTUATION PHENOMENON 

Another interesting illustration of fluctuation phenomena in physics 
is provided by the so-called shot effect. This explains the continual 
background of noise in a loud speaker actuated by the thermionic 
current in a vacuum tube in terms of the random emission of electrons 
from the cathode of the tube. This chance emission produces current 
fluctuations in the tube circuit. If we assume that the electron 
emission is completely unordered in the sense that the motion of each 
electron from the cathode is independent of that of any other, our 
statistical formulas should apply. 

Let us assume that over a very long time $T$ the number of electrons 
emitted by the cathode is $N$, whence the expected average 
number per second is $N/T$. Actually the number $N_t$ in time intervals 
of the magnitude $t$ ($\ll T$) will fluctuate from the expected average 
$Nt/T$. The fluctuation is given by 

$$\delta_t = N_t - \frac{Nt}{T}. \tag{58}$$

3 M. v. Smoluchowski, Ann. der Phys. 25, 205 (1908). 

The foregoing theory then indicates that the standard deviation of 
the distribution is 

$$\sigma = \sqrt{\frac{Nt}{T}}, \tag{59}$$

and the average relative fluctuation is 

$$\frac{\sigma}{Nt/T} = \frac{1}{\sqrt{Nt/T}}. \tag{60}$$

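B. Rajewsky$^4$ has been able to check this result by measuring the 
mean-square fluctuation in the tube-circuit current. Thus the average 
current over time $T$ is 

$$I_0 = \frac{eN}{T},$$

while the actual current during any interval $t$ is 

$$I_t = \frac{eN_t}{t}.$$

Hence the current fluctuation is 

$$\Delta I = I_t - I_0 = \frac{e}{t}\left(N_t - \frac{Nt}{T}\right).$$

The mean-square current fluctuation is 

$$\overline{\Delta I^2} = \frac{e^2}{t^2}\,\overline{\left(N_t - \frac{Nt}{T}\right)^2} = \frac{e^2}{t^2}\,\frac{Nt}{T} = \frac{I_0 e}{t}.$$
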

In Rajewsky's experiment he studied the emission of electrons 
from a special form of photocell constructed after the fashion of a 
Geiger-Muller counter. This allowed the counting of single electrons. 
In one particular case, for example, he counted 1,272 electrons in a 
period of 30 minutes. In our notation then T = 30 minutes and 
N = 1,272. This corresponds to an expected average of 42.4 elec- 
trons per minute. He observed the fluctuations from this figure over 
2-, 6-, 10-, and 20-minute intervals with results as follows: 

t IN MINUTES    $\sigma/(Nt/T)$, OBS.    $1/\sqrt{Nt/T}$, CALC.
      2               0.179                  0.109
      6               0.061                  0.063
     10               0.042                  0.048
     20               0.019                  0.034

The agreement in order of magnitude may be considered good enough 
to support the thesis that the shot effect comes within the realm of 
our simple statistical theory. 

4 Physik. Z. 32, 121 (1931). 
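
The calculated column of Rajewsky's table can be reproduced with a few lines of arithmetic. The following is a minimal sketch in Python, assuming only the figures quoted in the text (1,272 electrons in 30 minutes) and the relative fluctuation $1/\sqrt{Nt/T}$ of eq. (60); the function name `relative_fluctuation` is introduced here purely for illustration.

```python
import math

N = 1272      # electrons counted in the whole run (figure quoted in the text)
T = 30.0      # duration of the run in minutes (figure quoted in the text)
rate = N / T  # expected average number of electrons per minute

def relative_fluctuation(t_minutes):
    """Calculated relative fluctuation 1/sqrt(N t / T) for an interval t, eq. (60)."""
    expected_count = rate * t_minutes
    return 1.0 / math.sqrt(expected_count)

# Observed relative fluctuations as listed in the table of the text.
observed = {2: 0.179, 6: 0.061, 10: 0.042, 20: 0.019}

for t, obs in observed.items():
    calc = relative_fluctuation(t)
    print(f"t = {t:2d} min   observed = {obs:.3f}   calculated = {calc:.3f}")
```

The calculated values agree with the tabulated ones to within about one unit in the third decimal place, which is as much as the rounding of the original table allows one to expect.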

Allied to the fluctuation problem just considered is the fluctuation 
in electric charge which must continually take place in all bodies 
which according to the atomic theory of the structure of matter are 
assumed to consist of positively and negatively charged particles. 
Let us suppose that a given volume of matter consists of $N_1$ particles 
with charge $+e$ and $N_2$ particles with charge $-e$. If the material is 
electrically neutral we expect on the average that $N_1 = N_2$. However, 
with large $N_1$ and $N_2$ fluctuations are to be expected. Assume 
for simplicity that the charged particles are free and do not exert any 
influence on each other. The excess of positive over negative charge is 

$$\delta = e(N_1 - N_2).$$

The analogy between this and eq. (1) of this chapter should be sufficiently 
clear. If $N_1 + N_2 = N$, the root-mean-square value of $\delta$ will 
be (cf. eq. 12) 

$$\sqrt{\overline{\delta^2}} = e\sqrt{N},$$

which is a measure of the average fluctuation from complete electrical 
neutrality. The neglect of the electrical interaction between the 
charged particles makes this result too large and of questionable 
utility. Nevertheless it is conceivable that such fluctuations may 
some day be observed with sufficiently sensitive apparatus. 

11. RADIOACTIVE EMISSION AS A FLUCTUATION PHENOMENON 

In Sec. 4, Chapter I, we commented on the phenomenon of radio- 
active decay as describable in statistical terms. If this is so, we ought 
to be able to apply the reasoning of Sec. 7 to the emission of $\alpha$ 
particles from radioactive substances. The following description of 
an early experiment by Rutherford will bring out the essential features. 
Using the scintillation method in which the flash produced by the 
impact of an $\alpha$ particle on a fluorescent screen is used to count the 
number of such particles emitted by a radioactive substance in a 
given interval of time, Rutherford in a certain experiment counted 
10,097 particles emitted over a period of 326 minutes. For convenience 
he divided this period into 2,608 subintervals of 1/8 minute each 
and noted the number of particles emitted in each subinterval. The 
results of the count are given in the following table. 

Number of scintillations    Number of subintervals
          0                          57
          1                         203
          2                         383
          3                         525
          4                         532
          5                         408
          6                         273
          7                         139
          8                          45
          9                          27
         10                          10
         11                           4
         12                           0
         13                           1
         14                           1

This means that, e.g., there were 525 subintervals in which 3 scintillations 
were observed, while there were only 4 subintervals in which 
11 scintillations appeared. Now if this is really a statistical distribution, 
the calculated standard deviation should agree with the observed 
standard deviation. The former is simply (cf. 54) 

$$\sigma = \sqrt{Npq},$$

where here $N = 10,097$ and $N_1 = Np = 10,097/2,608 = 3.87$ = average 
number of scintillations per 1/8-minute subinterval. Hence $p = 1/2,608$, 
$q \approx 1$, and 

$$\sigma = \sqrt{3.87}.$$

If now we compute the actually observed standard deviation, we have 

$$2{,}608\,\sigma_{\rm obs}^2 = 57(3.87 - 0)^2 + 203(3.87 - 1)^2 + \cdots + 1(3.87 - 14)^2$$

and get 




The agreement between $\sigma_{\rm obs}$ and $\sigma$ (about 5 per cent discrepancy) may 
be considered close enough to justify the use of statistical analysis in 
treating the problem.$^5$ Of course this does not mean that we have 
a right automatically to use the Laplace formula or normal distribution 
law to describe the distribution in detail; if the Laplace formula were 
to be found not to apply, there would not necessarily be a contradiction. 
As a matter of fact, if we examine the assumptions on which the 
Laplace formula is based we see that it can be expected to hold only if 
neither $Np$ nor $Nq$ is too close to unity. Here, however, $Np$ clearly 
violates this condition. Another type of approximation for Newton's 
formula (38) is then clearly called for. 
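
The comparison of calculated and observed standard deviations in Rutherford's experiment can be retraced numerically. The sketch below is written in Python and assumes the tabulated counts of this section; the entries for 7 and 12 scintillations (139 and 0) are inferred from the stated totals of 2,608 subintervals and 10,097 particles, and all variable names are illustrative only.

```python
import math

# subintervals[k] = number of 1/8-minute subintervals showing k scintillations.
# The entries for k = 7 (139) and k = 12 (0) are inferred from the totals
# quoted in the text (2,608 subintervals, 10,097 particles).
subintervals = [57, 203, 383, 525, 532, 408, 273, 139, 45, 27, 10, 4, 0, 1, 1]

total_subintervals = sum(subintervals)
total_particles = sum(k * n for k, n in enumerate(subintervals))
mean = total_particles / total_subintervals  # average scintillations per subinterval

# Calculated standard deviation: sigma = sqrt(N p q) with q ~ 1, i.e. sqrt(mean).
sigma_calc = math.sqrt(mean)

# Observed standard deviation: root of the weighted mean-square deviation.
variance_obs = sum(n * (k - mean) ** 2
                   for k, n in enumerate(subintervals)) / total_subintervals
sigma_obs = math.sqrt(variance_obs)

print(f"subintervals = {total_subintervals}, particles = {total_particles}")
print(f"mean = {mean:.3f}")
print(f"sigma (calculated) = {sigma_calc:.3f}")
print(f"sigma (observed)   = {sigma_obs:.3f}")
```

The two totals serve as a check that the table has been read correctly before the standard deviations are compared.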

12. POISSON'S FORMULA 

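If in the expression (38) we substitute $np = c$, where $c$ is 
a number of order unity, we get 

$$\binom{n}{n_1}\left(\frac{c}{n}\right)^{n_1}\left(1 - \frac{c}{n}\right)^{n - n_1}.$$

This can be written in the form 

$$\frac{c^{n_1}}{n_1!}\left(1 - \frac{1}{n}\right)\left(1 - \frac{2}{n}\right)\cdots\left(1 - \frac{n_1 - 1}{n}\right)\left(1 - \frac{c}{n}\right)^{n}\left(1 - \frac{c}{n}\right)^{-n_1}.$$
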

5 For more recent precision observations on radioactive decay, reference should 
be made to L. F. Curtiss, Bureau of Standards Journal of Research, 8, 339 (1932). 

Now let it be assumed that $n_1 \ll n$. Hence each of the parenthetical 
expressions like $\left(1 - \frac{1}{n}\right)$ is approximately unity and the same is 
true of $(1 - c/n)^{n_1}$. As we recall further that $\lim_{n \to \infty}(1 - c/n)^n = e^{-c}$ 
and note that $n_1$ is, of course, finite, we have in the limit as $n \to \infty$ 

$$\frac{c^{n_1} e^{-c}}{n_1!}. \tag{64}$$


This is called Poisson's formula. It is applicable as long as $n_1$ is 
small compared with $n$. When plotted the Poisson formula gives rise 
to a skew curve as distinct from the symmetrical normal curve of the 
Laplace distribution formula.$^6$ However as $np$ gets larger the skew 
curve approaches the symmetrical one. The reader may show that 
the distribution expressed by (64) represents rather closely that 
observed in the radioactive emission experiment discussed in Sec. 11. 
For comparison the corresponding normal distribution should also be 
computed. 
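
The comparison suggested here can be sketched numerically. The Python fragment below assumes the scintillation counts of Sec. 11 (with the entries 139 and 0 for 7 and 12 scintillations inferred from the stated totals) and takes $c$ equal to the observed mean number of scintillations per subinterval; the helper name `poisson` is introduced only for this illustration.

```python
import math

def poisson(k, c):
    """Poisson's formula (64): probability of k events when the mean number is c."""
    return c ** k * math.exp(-c) / math.factorial(k)

# observed[k] = number of subintervals with k scintillations (Sec. 11);
# the entries for k = 7 and k = 12 are inferred from the stated totals.
observed = [57, 203, 383, 525, 532, 408, 273, 139, 45, 27, 10, 4, 0, 1, 1]

n_sub = sum(observed)                                      # number of subintervals
c = sum(k * n for k, n in enumerate(observed)) / n_sub     # mean per subinterval

# Expected number of subintervals with k scintillations on the Poisson law.
expected = [n_sub * poisson(k, c) for k in range(len(observed))]

print(" k   observed   Poisson")
for k, (obs, exp) in enumerate(zip(observed, expected)):
    print(f"{k:2d}   {obs:8d}   {exp:7.1f}")
```

Running the loop lists the observed and expected numbers side by side, which is the tabular form of the plot asked for in Problem 9.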

13. THE THEORY OF ERRORS 

A review of the elementary applications of probability and statis- 
tics to physics would scarcely be complete without a reference to the 
theory of errors which is fundamental for the estimation of the validity 
of all physical measurements. 

In the performance of any quantitative experiment the aim is to 
secure maximum significance of the result by reducing to a minimum 
all extraneous disturbing influences. Thus in the experimental study 
of the relation between the pressure and volume of a gas at constant 
temperature it is essential that the temperature remain really con- 
stant throughout a whole series of observations of pressure and 
volume. This is a problem demanding precise experimental tech- 
nique. The great progress in accurate measurement has come from 
the development of such technique. When all precautions have been 
taken, however, it still remains true in every measurement that the 
apparently precise repetition of a particular operation under appar- 
ently identical conditions will rarely yield the same numerical result. 
What numerical value is then to be chosen to represent the quantity 
being measured? It is the task of the theory of errors to answer 
this question. 

6 For a figure showing the relation between the Poisson formula and that of 
Gauss (Fig. 2-1), cf. T. C. Fry, "Probability and Its Engineering Uses," p. 239, 
D. Van Nostrand, 1928. 

Let the quantity being measured be denoted by $z$ and let a set of 
measurements of $z$ under presumably carefully controlled experimental 
conditions be $z_1, z_2, \ldots, z_n$. It is customary to ascribe the differences 
among the $z$'s to accidental errors as distinct from systematic errors 
which can be guarded against or accounted for by proper manipulation 
of the measuring apparatus. From the set of $n$ values it is necessary 
somehow to produce a value which shall stand as the final "correct" 
or acceptable one. It is most natural to assume that this will be 
some kind of average of the $z$'s. The simplest type of average is the 
arithmetical mean, i.e., 

$$\bar{z} = \frac{z_1 + z_2 + \cdots + z_n}{n}. \tag{65}$$

Let us for the moment adopt this as the acceptable value of the 
measured quantity. The quantities 

$$\Delta_i = z_i - \bar{z} \tag{66}$$

then represent the fluctuations or deviations from the mean of the 
various measured values. It is an observed fact that if in any carefully 
performed experiment we plot as ordinate the number of values 
as a function of the deviation from the mean, a frequency curve is 
obtained which, although it differs in detail for different experiments, 
nevertheless always possesses certain definite general characteristics. 
Strictly speaking, of course, the deviation from the mean is not a 
continuous set of values. What we do is to divide the total range of 
deviation into a set of equal intervals, i.e., from 0 to $+a$, from $+a$ to 
$+2a$, etc. In the middle of each interval is plotted the number of 
measurements for which the deviation falls in this interval. If a 
smooth curve is passed through the resulting points as $n$ is made 
sufficiently large, the result generally resembles the Gauss probability 
curve in Fig. 2-1 in the following respects: (a) there are many more 
values for which the magnitude of the deviation $\Delta$ is small than there 
are for which the magnitude of the deviation is large; (b) the number 
of values for any particular positive deviation interval tends to approximate 
the number of values for the corresponding negative deviation 
interval. Thus there always tends to be a maximum in the curve 
near $\Delta = 0$ though, of course, there may also be subsidiary but lower 
maxima. 

If there were no accidental errors involved in the measurement we 
should expect the same value z to result from every observation. 
The differences among the z's may therefore be called errors and the 
frequency curve above described may be called an error curve in 
which the ordinate gives the number of cases in which the error lies 
within a particular interval, say from $\Delta$ to $\Delta + d\Delta$. By division 
by $n$, the ordinate may further represent the probability of an error 
lying within the given interval. The "normal" or Gaussian error 
curve is then represented by the equation 

$$P(\Delta)\,d\Delta = \frac{h}{\sqrt{\pi}}\,e^{-h^2\Delta^2}\,d\Delta, \tag{67}$$

where $P(\Delta)\,d\Delta$ is the probability that the error of an observation shall 
lie in the interval $\Delta$, $\Delta + d\Delta$. The quantity $h$ is called the "measure 
of precision." Its value depends on the spread of the measurements. 

An important task of the theory of errors is to show under what 
conditions the expression (67) is justified. Many derivations of the 
normal law have been given based on a variety of fundamental 
hypotheses on the nature of accidental errors.$^7$ We shall not repeat 
any of these here but shall only show the intimate connection between 
the normal law and the arithmetical mean. This will indeed involve 
the demonstration that the assumption that the most probable value 
of a measured quantity is the arithmetical mean, leads directly to the 
normal law. 

Let us suppose that the probability that the error in a measured 
quantity shall lie in the interval $\Delta$, $\Delta + d\Delta$ is $f(\Delta)\,d\Delta$, where $f(\Delta)$ is 
the error function whose form we are seeking. If the $n$ observed 
values of the quantity in question are $z_1, z_2, \ldots, z_n$, respectively, and $z$ 
is assumed to be the actual "correct" value, the errors in the various 
measurements are 

$$\Delta_1 = z_1 - z, \quad \Delta_2 = z_2 - z, \quad \ldots, \quad \Delta_n = z_n - z. \tag{68}$$

The probability that an error shall lie in an arbitrarily small but 
definitely assigned region in the neighborhood of $\Delta_i$ is 

$$K f(\Delta_i),$$

where $K$ is a constant representing the size of the region. It is strictly 
the value of $d\Delta_i$, but we are practically agreeing to take all $d\Delta_i$ of the 
same size and call their magnitude $K$. The probability that in the set 
of $n$ measurements we shall make the $n$ errors $\Delta_i$ will then be, apart 
from the constant factor $K^n$, 

$$P = \prod_{i=1}^{n} f(\Delta_i). \tag{69}$$



7 For one such derivation, cf. Lindsay and Margenau, "Foundations of Physics," 
pp. 181 ff. 

Since $z$ has been assumed to be the "correct" value, the only meaning 
we can give to this statement is that $P$ shall be a maximum for $z$. The 
condition for this is, however, 

$$\sum_{i=1}^{n} \frac{1}{f(\Delta_i)}\,\frac{df(\Delta_i)}{d\Delta_i}\,\frac{d\Delta_i}{dz} = 0. \tag{70}$$

From (68) we have 

$$\frac{d\Delta_i}{dz} = -1, \quad \text{for all } i. \tag{71}$$

Hence the maximizing condition is 

$$\sum_{i=1}^{n} \frac{1}{f(\Delta_i)}\,\frac{df(\Delta_i)}{d\Delta_i} = 0. \tag{72}$$

Let us now assume that $z$ is the arithmetical mean. The condition 
(72) is then subject to the auxiliary condition 

$$\sum_{i=1}^{n} \Delta_i = 0. \tag{73}$$

By the method of Lagrange's multipliers (described and used in Sec. 2 
of Chapter IV) we can now express the conditions (72) and (73) in 
the form 

$$\sum_{i=1}^{n}\left[\frac{1}{f(\Delta_i)}\,\frac{df(\Delta_i)}{d\Delta_i} + \lambda\Delta_i\right] = 0, \tag{74}$$

where $\lambda$ is an undetermined multiplier. In order to satisfy this condition 
we have to set, for all $i$, 

$$\frac{1}{f(\Delta_i)}\,\frac{df(\Delta_i)}{d\Delta_i} + \lambda\Delta_i = 0. \tag{75}$$

The solution of this set of differential equations is 

$$f(\Delta_i) = C e^{-\lambda\Delta_i^2/2}. \tag{76}$$

By employing the condition that $P$ shall be a maximum and not a 
minimum it can be established that $\lambda$ is positive. Equation (76) is 
equivalent to the normal error law. It only remains to evaluate $C$. 
Since all errors must lie between $-\infty$ and $+\infty$, we must have 

$$\int_{-\infty}^{+\infty} f(\Delta)\,d\Delta = C\int_{-\infty}^{+\infty} e^{-\lambda\Delta^2/2}\,d\Delta = 1. \tag{77}$$

This fixes 

$$C = \sqrt{\frac{\lambda}{2\pi}}. \tag{78}$$

If we let $\lambda/2 = h^2$, we have 

$$f(\Delta) = \frac{h}{\sqrt{\pi}}\,e^{-h^2\Delta^2}, \tag{79}$$

which is equivalent to (67). 

The arithmetical mean has another important property. Let us 
form the so-called "residuals" by subtracting $\bar{z}$ from each measured 
value. Thus we have the set of quantities 

$$r_i = z_i - \bar{z}.$$

Now note that if we formed similar quantities for any other value, say 
$z$, we should have 

$$\sum_{i=1}^{n}(z_i - z)^2 = \sum_{i=1}^{n} r_i^2 + n(z - \bar{z})^2. \tag{80}$$

When $z = \bar{z}$, the sum of the squares of the residuals is least. This is 
the basis of the so-called method of least squares. 
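
The least-squares property expressed by eq. (80) is easily checked on numbers. The following Python sketch uses a small set of hypothetical measurements (the values in `z_values` are invented for illustration and do not come from the text) and compares the two sides of (80) for several trial values of $z$.

```python
# Hypothetical measurements of a single quantity (illustrative values only).
z_values = [9.8, 10.1, 10.0, 9.9, 10.3]

n = len(z_values)
z_bar = sum(z_values) / n                                  # arithmetical mean
residual_sq = sum((zi - z_bar) ** 2 for zi in z_values)    # sum of squared residuals

def sum_sq(z):
    """Left side of eq. (80): sum of squared deviations from a trial value z."""
    return sum((zi - z) ** 2 for zi in z_values)

def rhs(z):
    """Right side of eq. (80): squared residuals plus n (z - z_bar)^2."""
    return residual_sq + n * (z - z_bar) ** 2

for z in (9.5, z_bar, 10.4):
    print(f"z = {z:.3f}   left side = {sum_sq(z):.6f}   right side = {rhs(z):.6f}")
```

Since the second term on the right of (80) is never negative, any trial value other than the mean gives a larger sum of squares, as the printed figures show.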

The significance of the parameter $h^2$ in the normal error law 
becomes greater when we inquire as to the average error in a set of 
measurements of a single quantity. There are various ways of defining 
such an average, just as we found many ways of defining average 
deviation in the earlier sections of this chapter. Perhaps the most 
valuable average is the mean-square average error. This is defined as 

$$\overline{\Delta^2} = \frac{h}{\sqrt{\pi}}\int_{-\infty}^{+\infty}\Delta^2 e^{-h^2\Delta^2}\,d\Delta.$$

The evaluation of the integral (cf. eq. 32) gives 

$$\overline{\Delta^2} = \frac{1}{2h^2}.$$

The root-mean-square average error is then 

$$\sqrt{\overline{\Delta^2}} = \frac{1}{\sqrt{2}\,h}. \tag{82}$$

This serves to reinforce the meaning of $h$ as a measure of precision, 
since the larger $h$ is, the smaller is the root-mean-square error in a series 
of measurements of a quantity. 

The question now arises: Can we give an estimate of the error 
involved in the arithmetical mean itself? We can interpret (69) as the 
probability that any particular value $z$ shall be the "correct" value 
from the set of measurements $z_1, \ldots, z_n$. This probability can be 
written 

$$P(z) = \frac{h^n}{\pi^{n/2}}\,e^{-h^2\sum_i (z_i - z)^2}. \tag{83}$$

But from (80), this can be put into the form 

$$P(z) = \frac{h^n}{\pi^{n/2}}\,e^{-h^2\sum_i r_i^2}\,e^{-nh^2(z - \bar{z})^2}. \tag{84}$$

We can include $e^{-h^2\sum_i r_i^2}$ with the multiplicative constant and write 

$$P(z) = C e^{-nh^2(z - \bar{z})^2}. \tag{85}$$

$C$ is evaluated in the usual fashion, i.e., 

$$C\int_{-\infty}^{+\infty} e^{-nh^2(z - \bar{z})^2}\,dz = 1, \tag{86}$$

whence 

$$C = h\sqrt{\frac{n}{\pi}}. \tag{87}$$

The probability associated with the arithmetical mean is therefore 
from (85) 

$$P(\bar{z}) = h\sqrt{\frac{n}{\pi}}. \tag{88}$$

This means that the probability that the arithmetical mean shall 
represent the "correct" value for a quantity grows with the square 
root of the number of observations of the quantity. We can go 
further and express the mean-square error associated with the arithmetical 
mean $\bar{z}$ in the form 

$$\int_{-\infty}^{+\infty}(z - \bar{z})^2 P(z)\,dz = \frac{1}{2nh^2}. \tag{89}$$

The root-mean-square error associated with the arithmetical mean is 
thus 

$$\frac{1}{\sqrt{2}\,h\sqrt{n}}. \tag{90}$$

Since we have already seen that the root-mean-square error associated 
with any one of the measured values is $1/(\sqrt{2}\,h)$ (eq. 82) it follows that the 
arithmetical mean is $\sqrt{n}$ times as accurate as any one of the measured 
values. 
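
The $\sqrt{n}$ gain in accuracy can be illustrated by a small numerical experiment. The sketch below is not part of the original argument: it assumes normally distributed errors with root-mean-square value $1/(\sqrt{2}\,h)$ as in eq. (82), draws them with Python's standard pseudo-random generator (fixed seed), and compares the simulated root-mean-square error of the mean of $n$ measurements with the prediction of eq. (90). The function name and the choices $h = 1$, $n = 25$ are illustrative assumptions.

```python
import math
import random

def rms_error_of_mean(n, h, trials=4000, seed=0):
    """Simulated rms error of the arithmetical mean of n measurements."""
    rng = random.Random(seed)
    sigma = 1.0 / (math.sqrt(2.0) * h)  # rms error of one measurement, eq. (82)
    total = 0.0
    for _ in range(trials):
        # Error of the mean = mean of the n individual (normal) errors.
        mean_error = sum(rng.gauss(0.0, sigma) for _ in range(n)) / n
        total += mean_error ** 2
    return math.sqrt(total / trials)

h = 1.0   # assumed measure of precision
n = 25    # assumed number of measurements per set

simulated = rms_error_of_mean(n, h)
predicted = 1.0 / (math.sqrt(2.0) * h * math.sqrt(n))  # eq. (90)

print(f"simulated rms error of the mean: {simulated:.4f}")
print(f"predicted by eq. (90):           {predicted:.4f}")
```

With a few thousand trial sets the simulated value should lie within a few per cent of the predicted one; the residual difference is itself a statistical fluctuation of the kind treated in this chapter.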

The statistical theory of errors is now a very large subject to 
which we cannot begin to do justice in the above brief survey which 
has confined itself exclusively to the normal law and has disregarded 
other types of error distributions found in practice. A good review 
of the whole field will be found in the article by W. E. Deming and 
R. T. Birge, "On the Statistical Theory of Errors," in Reviews of 
Modern Physics, 6, 119 (1934). 



PROBLEMS 

1. Compare Stirling's formula for the integers 1 to 10 inclusive with the exact 
values of $n!$. Compute the absolute and percentage errors involved in the use of the 
formula. Do the same for $\log n!$ and in addition find the percentage error (for the 
range of $n$ above specified) involved in using simply $\log n! = n \log n - n$, i.e., neglecting 
$\log \sqrt{2\pi n}$. 

2. In connection with the higher order average deviations from the mean in a 
distribution of n objects, prove that when k is even 



V? 



[1.3.5 



3. Use Stirling's formula directly to transform Newton's formula to Laplace's 
formula, i.e., eq. (26). 

4. In a coin-tossing experiment 10 coins (U. S. one cent pieces) were tossed 
1,100 times and the distribution of heads and tails noted after each toss. The results 
of this and another similar series of tosses are presented in the following table. For 
the interpretation of the table it may be remarked, for example, that there were in 
the first series 198 tosses giving 6 heads and 4 tails and 207 such tosses in the second 
series. Compute the expected distribution from the algebraic formula (2'). Then 
calculate the average deviation from the expected mean (without regard to sign) as 
well as the standard and probable deviations. Compare these with the actually 
observed deviations in the two series of experiments. Comment on the results. 

Heads    Tails    Series 1    Series 2
  10       0          0           0
   9       1          8          11
   8       2         53          45
   7       3        125         134
   6       4        198         207
   5       5        291         274
   4       6        237         233
   3       7        123         136
   2       8         53          51
   1       9         11           7
   0      10          1           2
Total              1,100       1,100

Noting that in the first series the total number of heads was 5,448 and the total 
tails 5,552, whereas the expected mean would be 5,500 of each, calculate the prob- 
ability (eq. 36) of a deviation as large as that observed. Do the same for the second 
series and then for the results of both series taken together. Draw from this what- 
ever conclusions you deem reasonable and plausible. 

5. In Problem 4, plot (1) the expected distribution from the algebraic formula 

for ( J , (2) the expected distribution from Laplace's approximation (26), and 

(3) the actual distribution. Do this for both series. 

6. Carry out the analysis leading to the generalized Laplace's formula (eq. 51). 

7. Apply Smoluchowski's expression for $(\sigma/N_1)^2$ to the case of carbon dioxide 
at 31.89° C (cf. Int. Crit. Tables for data). Do the same for 100° C. Compare the 
result in each case with that obtained from the simple formula (54). 

8. Use Rajewsky's observed results on the shot effect to obtain the root-mean- 
square current fluctuation in his experiment. Could this scheme be used as a method 
of determining the charge on the electron? 

9. Plot the radioactive emission data of Sec. 11 as well as the corresponding 
Laplacian and Poisson distribution formulas and compare the experimental and 
theoretical distribution in each case. 

10. The median of a set of measured values is that value such that there are as 
many greater than it as there are less than it. Find the law of errors corresponding 
to the assumption that the median is the most probable value of the measured 
quantity. 



CHAPTER III 
REVIEW OF THERMODYNAMICS 

1. FUNDAMENTAL CONCEPTS 

In Chapter I we stressed the difficulties associated with the classical 
dynamical method of describing physical systems containing a large 
number of constituent particles. The alternative, the statistical point 
of view, is the one which will be followed out in detail in this book. It is 
well to recall, however, that there is a well-known dynamical method 
of describing the behavior of physical systems, particularly with rela- 
tion to heat. This is thermodynamics. It has been remarkably suc- 
cessful in correlating a vast amount of empirical data concerning the 
thermal changes of bodies. Its program has been accomplished without 
postulating a molecular constitution for physical objects; hence it 
has avoided the above-mentioned obstacles in the path of the precise 
application of dynamics to systems of many degrees of freedom. Since 
the object of much statistical reasoning in physics is to provide a basic 
theory in terms of which the facts of thermodynamics find a rational 
explanation, it is desirable at this place to review the fundamental 
concepts of this subject. 

Thermodynamics is a discipline which endeavors to describe the 
behavior of large scale bodies, particularly with reference to their ther- 
mal changes, by the use of dynamical concepts. As has already been 
emphasized, however, the method of attack is quite different from that 
of classical mechanics. Instead of visualizing a body as a system 
composed of a large number of material particles, whose motion is 
sufficient to account for its behavior, we consider the body as a whole, 
i.e., macroscopically. In particular its state is no longer defined in 
terms of component particles but rather in terms of large scale quan- 
tities, operationally defined. These are volume, pressure, and temper- 
ature, which are termed the fundamental state variables. Only two of 
these are independent, since for every body there exists a so-called 
"equation of state" (cf. Sec. 3) connecting them. Thermodynamics 
does not pretend to relate pressure, volume, and temperature for all 
conditions of bodies, but only those in which the system if left to itself 
will remain unchanged, i.e., what we shall call states of equilibrium. 
Whenever the same state of equilibrium is reproduced the state variables 
return to their previous values. In other words, they depend on 
the state alone and not on how the system got into the state. 

All thermodynamic changes of state are called processes. They may 
be reversible or irreversible. A process is reversible when by an 
infinitesimally small change in the parameters controlling the state, 
the system may be made to pass in either direction through a sequence 
of states, without any net change in the surrounding environment. All 
other processes are irreversible (cf. Sec. 3, Chapter I). A process that 
carries a system through a sequence of states back to the initial state is 
called a cyclic process. The Carnot cycle is a familiar illustration. 

Thermodynamics employs other concepts besides state variables. 
Thus in a cyclic process, a certain quantity of heat may be absorbed 
and a certain quantity of work done by the system. After the system 
has come back to its original state these quantities do not return to 
their original value; in fact there is no meaning to this statement. 
Quantity of heat absorbed or given up by a system is not a state vari- 
able; neither is quantity of work done by or on the system. Both 
these quantities depend vitally on how the system goes from its 
initial to its final state. It is clear that for a thermodynamical variable 
to be a state variable it must be expressible in terms of the funda- 
mental variables of state in such a way that any small change in it is a 
perfect differential of the corresponding changes in the state variables. 

2. THE TWO LAWS OF THERMODYNAMICS 

With these preliminaries out of the way we shall now recall that the 
theoretical basis of thermodynamics consists of two principles, called 
the first and second laws. The first law of thermodynamics comprises 
the assumptions that heat is a form of energy and that in any thermo- 
dynamic process there is conservation of energy. This can be written 
in symbolic form by the introduction of a new thermodynamic state 
variable, the total internal energy of the system, which we shall denote 
by $E$. The principle says that if a quantity of heat $\Delta Q$ is added to a 
system and a quantity of work $\Delta W$ is done by it, the associated change 
in the internal energy $\Delta E$ is given by 

$$\Delta E = \Delta Q - \Delta W. \tag{1}$$

We have already emphasized that $Q$ and $W$ are not variables of state 
or state functions. However, it is part of the content of the first law 
that $E$ is a state variable. In other words, if a system is allowed to 
undergo a series of thermodynamic cyclic processes from a definite 
initial state back to this same state, although $\Delta Q$ and $\Delta W$ may be 
quite different for the different processes, experiment indicates that 
$\Delta Q - \Delta W$ is zero for all the processes, thus suggesting that $\Delta E$ is 
always zero for a cyclic process and therefore that $E$ is a state variable. 
This experimental suggestion is erected into a definite postulate. We 
therefore use $dE$ to denote a change in $E$, implying that it is a perfect 
differential of the fundamental state variables, while we shall continue 
to use $\Delta Q$ and $\Delta W$ to bring out that these are not perfect differentials. 
It is scarcely necessary to emphasize that in (1) all quantities are 
expressed in energy units by the use of the mechanical equivalent of 
heat, approximately 4.2 joules per calorie. 

The first law is too general to serve by itself as the single basis of 
thermodynamics. It is necessary to supplement it by another principle 
whose purpose is to express the direction in which thermodynamic 
processes take place. We have already commented in Chapter I on the 
irreversible nature of many natural processes, particularly those in 
which heat transformations are concerned. The principle (1), how- 
ever, will apply just as well to reversible as irreversible processes. We 
need therefore a state variable which changes only in one direction, i.e., 
either never increases or never decreases. Such a quantity is found in 
the entropy which is defined by its differential, i.e., 

$$dS = \frac{\Delta Q}{T}. \tag{2}$$

In this definition it is understood that $\Delta Q$ is the change in heat energy 
in a reversible process. The function $S$ defined in this way is a state 
variable, though $Q$ is not. This can be shown by the generalization$^1$ 
to the case of any reversible cycle of the fact that if in any Carnot 
cycle heat $\Delta Q_1$ is taken in at temperature $T_1$ and heat $\Delta Q_2$ is given out 
at temperature $T_2$, we have 

$$\frac{\Delta Q_1}{T_1} = \frac{\Delta Q_2}{T_2}. \tag{3}$$



In order to state the second law of thermodynamics we need one 
more concept, that of a closed system. This means a system which has 
no interaction with its environment, i.e., it cannot gain energy from 
nor lose energy to its surroundings. With this and the foregoing in 
mind we can state the second law in the form: the entropy of a closed 
system never decreases. It may not change (as in a reversible cycle) 
but if it does change, it must increase, and this is always true for 
irreversible processes. In the latter case it is not possible to calculate 
the change in entropy from (2): it is necessary to replace the irreversible 
process by a reversible one which has the same initial and final states. 
Then (2) can be applied. The reader will recall several conventional 
ways of stating the second law, e.g., that it is impossible for a self-acting 
engine continuously to convey heat from a body of lower temperature 
to one of higher temperature. All such statements may be 
shown to be logically equivalent to the one we have given above. 

1 See, for example, Leigh Page, "Introduction to Theoretical Physics" (second 
edition), p. 289. D. Van Nostrand Co., New York, 1935. 

The entropy may be used to give an alternative formulation of the 
first law (1). Thus 

$$dE = T\,dS - \Delta W. \tag{4}$$

As we have already emphasized, the work $\Delta W$ depends on the nature 
of the process. However, it may be thought of as owing to the change 
in certain parameters $\xi_1, \ldots, \xi_n$ which express the dependence of the 
system on its external surroundings. If the change in $\xi_j$ is associated 
with a generalized force $F_j$, the work done in the change $d\xi_j$ is $F_j\,d\xi_j$ and 
we can write (4) in the form 

$$dE = T\,dS - \sum_j F_j\,d\xi_j. \tag{5}$$



It must be pointed out that eq. (5) does not refer to a closed system 
since it contemplates interaction of the system and its environment. 
Hence in (5) dS need not be zero or positive as is required by the 
second law for a closed system. 

It cannot be too strongly emphasized that the changes symbolized 
in (5) are those that take place between equilibrium states. Neverthe- 
less since E and S are state variables, eq. (5) will apply even when the 
change from one equilibrium state to another takes place irreversibly, 
i.e., through a series of non-equilibrium states. Can we talk about the 
entropy of a system when it is not in an equilibrium state? Certainly 
we cannot compute it by eq. (2), since that applies only to a reversible 
process and a closed system cannot reach a non-equilibrium state by a 
reversible process. It can, however, reach an equilibrium state from a 
non-equilibrium state by means of an irreversible process in which the 
entropy will increase. Hence we can say that the entropy of a closed 
system in a non-equilibrium state is less than its value in the equi- 
librium state toward which it proceeds. This is the basis of the state- 
ment of the second law in the form: the entropy of a closed system 
tends to a maximum value or the entropy in an equilibrium state is a 
maximum with respect to its value for all non-equilibrium states from 
which the given equilibrium state can be reached by irreversible 
processes. Naturally the closed system, if in an equilibrium state, 
will not proceed by an irreversible process to a non-equilibrium state; 
this is inherent in the definitions of equilibrium state and irreversible 
process. 

3. FREE ENERGY AND OTHER THERMODYNAMIC FUNCTIONS 

From the entropy, internal energy, volume, pressure, and tempera- 
ture other state variables can be formed. The most important of 
these are the following : 

(a) Helmholtz free energy (or Gibbs' $\Psi$ function)

$$\Psi = E - TS. \tag{6}$$

In future applications, when we speak of "free energy" it will be this
function which is meant.

(b) Enthalpy (or Gibbs' X function)

$$X = E + pV. \tag{7}$$

(c) Gibbs free energy (or Gibbs' Z function)

$$Z = E + pV - TS = X - TS = \Psi + pV. \tag{8}$$

For the statistical interpretation of thermodynamics the Helmholtz
free energy is the most important, though for most applications of
thermodynamics, the Z function is more significant. As eq. (8)
indicates, $Z$ and $\Psi$ are closely related.
From (6) we have

$$d\Psi = dE - T\,dS - S\,dT. \tag{9}$$

But (5) can be used to transform this to

$$d\Psi = -S\,dT - \sum_j F_j\,d\xi_j. \tag{10}$$

For an isothermal process ($dT = 0$) eq. (10) says that there is a
decrease in free energy equal to the external work done by the system.
If the only way in which the system can do work on its surroundings
is to change its volume against external pressure $p$ and if this process
is isothermal, we have $\sum_j F_j\,d\xi_j = p\,dV$ and hence (10) becomes

$$d\Psi = -p\,dV, \tag{11}$$

or

$$p = -\left(\frac{\partial \Psi}{\partial V}\right)_T. \tag{12}$$




This important relation is called the equation of state of the system.
If $\Psi$ can be found as a function of $V$ and $T$, it enables us to connect
$p$, $V$, and $T$ for the system. As an illustration, we shall later see that
a statistical analysis yields for the free energy of an ideal gas of $N$
particles occupying volume $V$ at temperature $T$

$$\Psi = -NkT \log V + K, \tag{13}$$

where $k$ is the so-called Boltzmann gas constant ($k = 1.37 \times 10^{-16}$
erg/degree C) and $K$ is an arbitrary additive constant independent of
volume but not necessarily independent of temperature. The combination
of (12) and (13) gives

$$pV = NkT \tag{14}$$

as the equation of state for an ideal gas. 
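
The step from the free energy to the equation of state can be checked numerically. The following is only a sketch under stated assumptions: the function names, the choice $K = 0$, and the sample values of $N$, $V$ and $T$ are illustrative and do not come from the text. It differentiates the free energy (13) by a central difference, as prescribed by (12), and compares $pV$ with $NkT$.

```python
import math

K_BOLTZMANN = 1.37e-16  # erg/degree, the value quoted in the text


def free_energy(volume, temperature, n_particles, const=0.0):
    """Eq. (13): Psi = -N k T log V + K (K = 0 is an arbitrary choice here)."""
    return -n_particles * K_BOLTZMANN * temperature * math.log(volume) + const


def pressure(volume, temperature, n_particles, rel_step=1e-6):
    """Eq. (12): p = -(dPsi/dV)_T, estimated by a central difference."""
    h = volume * rel_step
    upper = free_energy(volume + h, temperature, n_particles)
    lower = free_energy(volume - h, temperature, n_particles)
    return -(upper - lower) / (2.0 * h)


# Illustrative values only (cm^3 and degrees absolute).
N, V, T = 6.0e23, 2.24e4, 273.0
p = pressure(V, T, N)
print(p * V, N * K_BOLTZMANN * T)  # the two numbers should agree closely
```

Because $K$ does not depend on the volume, any other choice of `const` leaves the computed pressure unchanged.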

4. SOME THERMODYNAMIC RELATIONS 

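Important thermodynamic relations can be deduced from the fact
that $dE$, $d\Psi$, $dX$ and $dZ$ are perfect differentials of the state variables.
Thus from (10) it follows that

$$S = -\left(\frac{\partial \Psi}{\partial T}\right)_{\xi}, \qquad F_j = -\left(\frac{\partial \Psi}{\partial \xi_j}\right)_{T}. \tag{15}$$

From the first of the expressions in (15), the free energy itself can be
written in the form

$$\Psi = E + T\left(\frac{\partial \Psi}{\partial T}\right)_{\xi}. \tag{16}$$

This is sometimes called the Gibbs-Helmholtz equation. By differentiating
$S$ in (15) partially with respect to $\xi_j$ and $F_j$ with respect to $T$
we obtain

$$\left(\frac{\partial S}{\partial \xi_j}\right)_{T} = \left(\frac{\partial F_j}{\partial T}\right)_{\xi}. \tag{17}$$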



When the $\xi_j$ reduce to a single parameter, namely $V$, and the corresponding
$F_j$ becomes $p$, this is known as one of Maxwell's thermodynamic
relations. 2 Three other relations can be derived by expressing
in similar fashion the fact that $dE$, $dX$, and $dZ$ are perfect differentials.
They are set down here for convenient reference. The first
is (written in general form)

$$\left(\frac{\partial T}{\partial \xi_j}\right)_{S} = -\left(\frac{\partial F_j}{\partial S}\right)_{\xi}.$$

2 Cf. op. cit., p. 296.

The other two are usually written only for the case in which $F_j$ reduces
to $p$ and $\xi_j$ becomes $V$. Then they are

$$\left(\frac{\partial T}{\partial p}\right)_{S} = \left(\frac{\partial V}{\partial S}\right)_{p}, \qquad \left(\frac{\partial S}{\partial p}\right)_{T} = -\left(\frac{\partial V}{\partial T}\right)_{p}.$$

The heat capacity is an important concept which we shall meet
later. By definition the heat capacity of a substance at constant
volume is the heat absorbed per degree change in temperature at constant
volume, or

$$C_v = \left(\frac{\Delta Q}{dT}\right)_V = T\left(\frac{\partial S}{\partial T}\right)_V = \left(\frac{\partial E}{\partial T}\right)_V.$$

The last step came from the first law (5) with $\sum_j F_j\,d\xi_j = p\,dV$. We can
also define a heat capacity at constant pressure, thus:

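$$C_F = \left(\frac{\Delta Q}{dT}\right)_F = T\left(\frac{\partial S}{\partial T}\right)_F. \tag{22}$$

If the parameters reduce to the volume alone and $F_j$ becomes the
pressure $p$, (22) may be written

$$C_p = T\left(\frac{\partial S}{\partial T}\right)_p = \left(\frac{\partial E}{\partial T}\right)_p + p\left(\frac{\partial V}{\partial T}\right)_p. \tag{23}$$

The general formula

$$C_p - C_v = T\left(\frac{\partial p}{\partial T}\right)_V \left(\frac{\partial V}{\partial T}\right)_p \tag{24}$$

becomes the well-known formula

$$C_p - C_v = Nk \tag{25}$$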

for the special case of an ideal gas. When applied to one gram of substance,
the heat capacity becomes the usual specific heat, denoted by
$c_v$ and $c_p$.



5. 



With the basic principles discussed in the preceding sections of this 
chapter it has proved possible to give a coherent description of a large 
number of physical and chemical phenomena. This is not a textbook 
of thermodynamics and hence we shall not pursue the purely thermodynamic
method of description in a detailed fashion. 3 Rather we are
now interested in seeing how the statistical point of view provides an 
interpretation of thermodynamics in terms of the atomic constitution 
of matter. Several different ways of doing this have been devised. 
Those which we shall consider in this book are : (a) the classical sta- 
tistics of Maxwell and Boltzmann; (b) the classical kinetic theory; 
(c) the statistical mechanics of Gibbs; (d) the statistical mechanics of 
Darwin and Fowler; (e) the quantum statistics. In each case the 
statistical theory strives to set up a number of statistical quantities 
analogous to the state variables of thermodynamics and beginning 
with very general postulates to derive a set of relations among them 
which can be interpreted as physically equivalent to the thermo- 
dynamic relations we have just discussed. The hope exists, further- 
more, that the statistical point of view will provide an even deeper 
understanding of physical phenomena than thermodynamics by sug- 
gesting laws which are not susceptible of thermodynamic derivation. 

Our method of procedure will be, to a certain extent, an historical
one; we shall examine the older physical statistical methods first. This
is the natural order, for the more recent quantum statistics employs 
the same fundamental ideas as the earlier statistics and it will be desir- 
able to have the latter firmly in mind before proceeding to the former. 
It is hoped that in this way the reader will get a clearer view of the 
whole subject than if we tried to adopt a unified point of view and 
abandoned the historical approach entirely. We shall indeed find that 
the different methods of presenting classical statistics lead essen- 
tially to the same result when applied to the same problem. Some 
may therefore take the stand that a discussion of all methods is super- 
fluous. On the other hand the greater the number of ways in which we 
can look at a problem the more profound and thorough should be our 
understanding of it. 

PROBLEMS 

1. In the schematic diagram A and B represent equal volumes. A is occupied 
by a mass m of an ideal gas at pressure p and temperature T, while B 
is a perfect vacuum. Calculate the change in entropy which results 
when a hole is opened in the partition between A and B, allowing the 
gas to move freely from A to B. 

3 At this point the reader may wish to consult any one of a number of standard,
more or less elaborate, treatments of thermodynamics, e.g., "Textbook of Thermodynamics,"
by P. Epstein, John Wiley and Sons, New York, 1937; "Thermodynamics,"
by E. Fermi, Prentice-Hall, New York, 1937; "Heat and Thermodynamics,"
by M. W. Zemansky, McGraw-Hill, New York, 1937. Of particular interest to
physicists in view of the discussion in the later chapters of the present volume is
P. W. Bridgman's "Thermodynamics of Electrical Phenomena in Metals," Macmillan,
New York, 1935.

2. Compute the change in entropy involved in mixing 1,000 grams of water at
80°C with 500 grams of water at 15°C. Assume that the specific heat of water is
constant and equal to 1 cal/gram degree C.

3. Two grams of hydrogen have an initial volume corresponding to a pressure of
76 cm of Hg and temperature 20°C. The volume being kept constant, the gas is
heated to temperature 80°C. It is then allowed to expand at constant temperature
to double its original volume. How much heat has been absorbed by the gas
and how much work has been done on it? Determine the same quantities when the
gas is first allowed to expand to double its volume, the temperature being kept constant
at 20°C, and is then heated at constant volume from 20°C to 80°C. In each
case also calculate the total change in the internal energy of the gas.

4. From eq. (17) of this chapter derive the Clausius-Clapeyron equation, viz.,

$$\frac{dp}{dT} = \frac{L}{T(\Delta V_s)_T},$$

where $L$ = latent heat of vaporization and $(\Delta V_s)_T$ is the change in specific volume
between liquid and vapor phases at constant temperature $T$.

If the specific volume of water at 100°C is 1 cm³/gram, while that of steam is
1,686 cm³/gram, find the change in the boiling point of water produced by lowering
the pressure by 10 cm of Hg in the neighborhood of 76 cm of Hg.

If the difference in specific volume between the liquid and solid phases of water
at 0°C is 0.1 cm³/gram, find the depression of the freezing point of water associated
with an increase in pressure of one atmosphere.

5. It is shown in Chapter IV (eq. 84) that the free energy of an ideal gas with 
N particles in volume V at temperature T is 



Use eq. (16) of this Chapter to find the expression for the total energy of the gas. 

Given one mole of an ideal gas under standard conditions of temperature and 
pressure. Find the change in free energy if the volume is doubled at constant temperature.
Find the change in free energy if the temperature is raised by 1°C while
the volume is kept constant.

6. Derive the expression for the entropy of an ideal gas from the application of 
Problem 5 to eq. (15) of this chapter. 

7. By applying eq. (25) of this chapter to the special case of hydrogen, show how 
the mechanical equivalent of heat may be calculated. 



CHAPTER IV 
CLASSICAL MAXWELL-BOLTZMANN STATISTICS 

1. STATISTICAL DISTRIBUTION OF N OBJECTS IN $\mu$ GROUPS

The fundamental problem of what has come to be called the
Maxwell-Boltzmann statistics is the following: Given a large number
of objects, $N$, e.g., molecules of a gas, it is desired to distribute
these with respect to some property they all possess, e.g., position in
space, velocity or kinetic energy. It will be convenient to think of
this property as associated with a set of $\mu$ boxes, with a definite value
of the property attached to each box. We shall first assume that the
objects are indistinguishable, that they move freely and exert no
forces on each other, and that it is just as likely that a particular
object shall lie in one box as in any other. We can then readily compute
the probability of an arrangement in which there are $N_1$ objects
in the first box, $N_2$ in the second, $N_3$ in the third, ..., and $N_\mu$ in the
$\mu$th, by finding the number of independent ways in which this distribution
can be achieved. From eq. (2') of Chapter II the number of
ways of choosing $N_1$ of the $N$ objects to place in the first box is simply

$$\binom{N}{N_1} = \frac{N!}{N_1!\,(N - N_1)!}.$$

Similarly the number of ways of choosing $N_2$ objects out of the remaining
$(N - N_1)$ to place in the second box is

$$\binom{N - N_1}{N_2} = \frac{(N - N_1)!}{N_2!\,(N - N_1 - N_2)!}.$$

Clearly then the total number of ways required is the product

$$\binom{N}{N_1}\binom{N - N_1}{N_2} \cdots \binom{N - N_1 - \cdots - N_{\mu-1}}{N_\mu},$$

which can be reduced to the following form

$$\begin{bmatrix} N \\ N_j \end{bmatrix} = \frac{N!}{N_1!\,N_2! \cdots N_\mu!}. \tag{3}$$
For simplicity we shall use the square bracket to denote this expression.
The probability of the arrangement is then $\begin{bmatrix} N \\ N_j \end{bmatrix}$
divided by the total number of ways of distributing the $N$ objects
among the $\mu$ boxes without regard to the number of objects in each
box. By a simple generalization of the argument in Sec. 1 of
Chapter II, the latter number is $\mu^N$. Hence the probability desired
is simply

$$P_N = \frac{1}{\mu^N}\begin{bmatrix} N \\ N_j \end{bmatrix}. \tag{4}$$

Equation (4) can be immediately generalized to the case where the
a priori probabilities of each box are not equal. Suppose that they
are actually given by $g_1, g_2, \ldots, g_\mu$ respectively, where, since we here
assume that these are actual mathematical probabilities, we have

$$\sum_j g_j = 1. \tag{5}$$

Then the probability that $N_1$ objects are in the first box, $N_2$ in the
second, etc., if there were only one way in which to realize this distribution,
would be

$$g_1^{N_1} g_2^{N_2} \cdots g_\mu^{N_\mu}.$$

Since however the number of ways of making this distribution is
$\begin{bmatrix} N \\ N_j \end{bmatrix}$, the total probability becomes

$$P_N = \begin{bmatrix} N \\ N_j \end{bmatrix} g_1^{N_1} g_2^{N_2} \cdots g_\mu^{N_\mu}. \tag{6}$$

This reduces to (4) when $g_1 = g_2 = \cdots = g_\mu = 1/\mu$.

We now ask the following question: Given an aggregate of $N$
objects whose number remains constant, for what type of distribution
among $\mu$ boxes will $P_N$ have the maximum value? Intuition answers
the question by saying that the required distribution is that for which
the number in the $j$th box is

$$N_j = N g_j. \tag{7}$$

Let us see whether we can confirm this conjecture. The problem is
to make $P_N$ a maximum subject to the condition

$$\sum_j N_j = N = \text{constant}. \tag{8}$$

Choose any set of $N_j$ satisfying the condition (8). If we can find the
conditions which $g_1, \ldots, g_j, \ldots, g_\mu$ must satisfy in order to make $P_N$
a maximum subject to (5), we shall have solved the problem. We
write (6) in the form

$$\log P_N = \log \begin{bmatrix} N \\ N_j \end{bmatrix} + \sum_j N_j \log g_j, \tag{9}$$



and find it simpler to make $\log P_N$ a maximum subject to the condition
(5). But $\log P_N$ is stationary if any slight but arbitrary
variation in the $g_j$ leads to zero variation in $\log P_N$. The meaning
of $\log P_N$ assures that in the present case the stationary value will be
a maximum. Hence the mathematical formulation of the problem is

$$\delta \log P_N = 0, \tag{10}$$

subject to

$$\sum_j \delta g_j = 0, \tag{11}$$

where we use the symbol $\delta$ to denote an arbitrary variation. We
then have

$$\sum_j \frac{N_j}{g_j}\,\delta g_j = 0 \tag{12}$$

subject to (11). Let us solve (11) for $\delta g_1$ in terms of the rest:

$$\delta g_1 = -\sum_{j=2}^{\mu} \delta g_j. \tag{13}$$

Substitute this into (12) and obtain

$$\sum_{j=2}^{\mu} \left(\frac{N_j}{g_j} - \frac{N_1}{g_1}\right) \delta g_j = 0. \tag{14}$$

In this expression $\delta g_2, \ldots, \delta g_\mu$ are completely arbitrary, since by giving
them any consistent values (proper fractions, of course) we can still
choose $\delta g_1$ to satisfy the condition (11). Hence the only way to satisfy
(14) is to have identically

$$\frac{N_1}{g_1} = \frac{N_2}{g_2} = \cdots = \frac{N_\mu}{g_\mu} = N. \tag{15}$$

The fact that the constant ratio is $N$ is, of course, a result of (5) and
(8). This checks the intuitive deduction.
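
The result (15) can also be confirmed by brute force for a small aggregate. The sketch below is illustrative only: the function names and the sample values of $N$ and $g_j$ are assumptions, not taken from the text. It evaluates the probability (6) for every distribution satisfying (8) and reports the most probable one, which should coincide with $N_j = N g_j$ whenever the products $N g_j$ are whole numbers.

```python
from itertools import product
from math import factorial, prod


def multinomial_probability(counts, g):
    """Eq. (6): N!/(N_1! ... N_mu!) times g_1^N_1 ... g_mu^N_mu."""
    ways = factorial(sum(counts))
    for c in counts:
        ways //= factorial(c)  # stays an exact integer at every step
    return ways * prod(gj ** c for gj, c in zip(g, counts))


def most_probable_distribution(n_total, g):
    """Search all (N_1, ..., N_mu) with sum N_j = n_total, as in eq. (8)."""
    best_counts, best_p = None, -1.0
    for head in product(range(n_total + 1), repeat=len(g) - 1):
        last = n_total - sum(head)
        if last < 0:
            continue  # this choice would violate the constraint (8)
        counts = head + (last,)
        p = multinomial_probability(counts, g)
        if p > best_p:
            best_counts, best_p = counts, p
    return best_counts


g = (0.5, 0.3, 0.2)   # illustrative a priori probabilities, sum = 1
N = 20
print(most_probable_distribution(N, g), tuple(N * gj for gj in g))
```

The exhaustive search grows rapidly with $N$ and $\mu$, so it serves only as a check on small cases; the variational argument above is what handles large aggregates.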

We have indeed carried out the deduction (15) somewhat indirectly.
In actuality the $g_j$ are fixed quantities and the distribution
sought is the set of $N_j$ for which $P_N$ is a maximum subject to (8).
Suppose, however, that we have the correct set of $N_j$. If we substitute
them into (9), to secure the required maximum we must still
have (15) satisfied. As a matter of fact we could start with (9) and
vary the $N_j$ subject to (8) keeping the $g_j$ fixed. This involves using
Stirling's formula for the factorials in $\begin{bmatrix} N \\ N_j \end{bmatrix}$. It will yield the result
(15), as the reader can show, but the method we have used is simpler.
The reader should see the connection between the distribution
given by eq. (15) and the special one treated in Sec. 9 of Chapter II
and expressed in eq. (53) of that chapter.

2. THE CANONICAL DISTRIBUTION 

We now wish to generalize the distribution problem of Sec. 1 by
assigning to each box a certain property which will be possessed by any
object in that box. To fix the ideas clearly, let us imagine that the
objects are material particles and the property is kinetic energy. We
again assume that the particles are free. The energy $E_j$ will be
assigned to the $j$th box: whenever a particle is in that box it possesses
precisely this energy value. The total energy of the aggregate of
particles is thus $E = \sum N_j E_j$. The problem is now to distribute the
particles among the boxes in such a way that the distribution probability
$P_N$ will be a maximum subject not only to the condition of
Sec. 1, viz., that the total number of particles remains constant, but
also to the additional condition that the total energy of the aggregate
remains unchanged. It will be assumed that the boxes have the
a priori probabilities $g_1, \ldots, g_\mu$ as in Sec. 1.

As before we shall work with $\log P_N$. From (9) using (3), we have

$$\log P_N = \log N! - \sum \log N_j! + \sum N_j \log g_j. \tag{16}$$

The analytical formulation of the problem is

$$\delta \log P_N = 0, \tag{17}$$

subject to

$$\delta \sum N_j = 0; \qquad \delta \sum N_j E_j = 0. \tag{18}$$

In all sums, unless otherwise specified, it will be assumed that $j$ runs
from 1 to $\mu$.

From Stirling's formula (eq. 25 of Chapter II)

$$\log N! = N \log N - N + \tfrac{1}{2} \log 2\pi + \tfrac{1}{2} \log N. \tag{19}$$

If $N$ is sufficiently large the terms $\tfrac{1}{2} \log 2\pi$ and $\tfrac{1}{2} \log N$ are negligible
compared with the first two terms on the right. We shall consistently
neglect them for the sake of simplicity. This might seem to
be an invalid procedure in $\log N_j!$ when $N_j$ is small, as it can quite
possibly be. However the error involved will still be small even here,
because the terms $\log N_j!$ for small $N_j$ make only a very small contribution
to the whole sum. Condition (17) now becomes

$$\sum \left(\log \frac{g_j}{N_j}\right) \delta N_j = 0, \tag{20}$$

and conditions (18)

$$\sum \delta N_j = 0; \qquad \sum E_j\,\delta N_j = 0. \tag{21}$$

Note that the $E_j$ are definitely fixed and the only possible variation
in energy comes from the $\delta N_j$. The maximization problem is now carried
out by the use of Lagrange's undetermined multipliers. 1 We
choose the initially arbitrary constant multipliers $\gamma_1$ and $\gamma_2$ and,
remembering (20) and (21), write

$$\sum \left(\log \frac{g_j}{N_j} + \gamma_1 + \gamma_2 E_j\right) \delta N_j = 0. \tag{22}$$

We now pick $\gamma_1$ and $\gamma_2$ to satisfy the equations

$$\log \frac{g_1}{N_1} + \gamma_1 + \gamma_2 E_1 = 0, \qquad \log \frac{g_2}{N_2} + \gamma_1 + \gamma_2 E_2 = 0. \tag{23}$$

This we have a right to do, since $\gamma_1$ and $\gamma_2$ are arbitrary. Equation
(22) now becomes

$$\sum_{j=3}^{\mu} \left(\log \frac{g_j}{N_j} + \gamma_1 + \gamma_2 E_j\right) \delta N_j = 0, \tag{24}$$

and the variations $\delta N_3, \ldots, \delta N_\mu$ occurring here are now completely
arbitrary. For all $j$, therefore, we have

$$\log \frac{g_j}{N_j} = -\gamma_1 - \gamma_2 E_j,$$

or

$$N_j = g_j e^{\gamma_1 + \gamma_2 E_j}. \tag{25}$$

1 Cf., Leigh Page, "Introduction to Theoretical Physics," second edition, p. 311.
D. Van Nostrand Co., New York, 1935.

It is now convenient to replace $\gamma_1$ and $\gamma_2$ by two new parameters $\psi$
and $\Theta$ defined by the equations

$$\gamma_1 = \frac{\psi}{\Theta} + \log \mu N, \qquad \gamma_2 = -\frac{1}{\Theta}. \tag{26}$$

The distribution formula (25) then takes the form

$$N_j = \mu g_j N e^{(\psi - E_j)/\Theta}. \tag{27}$$



The parameters $\psi$ and $\Theta$ can at once be determined, at least in principle,
as follows. We have from (8)

$$\sum_j N_j = N e^{\psi/\Theta} \sum_j \mu g_j e^{-E_j/\Theta} = N,$$

whence

$$\psi = -\Theta \log \sum_j \mu g_j e^{-E_j/\Theta}, \tag{28}$$

which gives $\psi$ in terms of $\Theta$. To get the latter consider the total energy

$$E = \sum_j N_j E_j = N e^{\psi/\Theta} \sum_j \mu g_j E_j e^{-E_j/\Theta}. \tag{29}$$

In this way we see that $\psi$ and $\Theta$ are expressible in terms of $E$, $N$, $g_j$
and $E_j$. The actual evaluation of $\psi$ and $\Theta$ from these transcendental
equations cannot, of course, be carried through in closed form. We
shall hope, however, in the subsequent discussion to give them a
physical interpretation. Incidentally (29) also provides an expression
for the average energy per particle in the distribution, viz.,

$$\bar{E} = \frac{E}{N} = \frac{\sum_j g_j E_j e^{-E_j/\Theta}}{\sum_j g_j e^{-E_j/\Theta}}. \tag{30}$$

The distribution defined by (27) is usually termed a canonical
distribution. Since

$$e^{-\psi/\Theta} = \sum_j \mu g_j e^{-E_j/\Theta}, \tag{31}$$

an alternative way of expressing this type of distribution is clearly

$$N_j = \frac{N g_j e^{-E_j/\Theta}}{\sum_j g_j e^{-E_j/\Theta}}. \tag{27'}$$

This eliminates the parameter $\psi$ from the distribution formula. The
parameter $\Theta$ is termed the distribution modulus. It is clear that it
must have the dimensions of energy and so must $\psi$. We have said
nothing so far about the sign of $\Theta$, which might be considered arbitrary.
However, from (27') it is clear that negative $\Theta$ would lead to
indefinitely large values of $N_j$ for increasingly large $E_j$. As a matter
of fact, investigation shows that $\Theta$ must be positive if condition (17)
is to correspond to a maximum rather than a minimum of $\log P_N$.
By disregarding $g_j$, positive $\Theta$ makes the number of particles in the
$j$th box decrease exponentially with the energy assigned to that box.
Of course the distribution is affected by the choice of a priori probabilities
$g_j$.
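
Although the modulus cannot be found in closed form, it is easily found numerically. The following sketch is offered under stated assumptions: the function names, the bisection bounds and the four-level example are illustrative and are not part of the text. It evaluates the distribution (27') and solves (30) for $\Theta$ by bisection, relying on the fact that the average energy increases steadily with $\Theta$.

```python
import math


def boltzmann_weights(energies, g, theta):
    """Weights g_j exp(-E_j/Theta); the lowest energy is subtracted for
    numerical safety, which leaves every ratio in (27') and (30) unchanged."""
    e_min = min(energies)
    return [gj * math.exp(-(ej - e_min) / theta) for gj, ej in zip(g, energies)]


def canonical_distribution(energies, g, n_total, theta):
    """Eq. (27'): N_j = N g_j e^{-E_j/Theta} / sum_j g_j e^{-E_j/Theta}."""
    weights = boltzmann_weights(energies, g, theta)
    norm = sum(weights)
    return [n_total * w / norm for w in weights]


def mean_energy(energies, g, theta):
    """Eq. (30): average energy per particle."""
    weights = boltzmann_weights(energies, g, theta)
    return sum(w * e for w, e in zip(weights, energies)) / sum(weights)


def solve_modulus(energies, g, target_mean, lo=1e-3, hi=1e3, iterations=200):
    """Bisection for Theta; the bounds lo and hi are assumed to bracket it."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mean_energy(energies, g, mid) < target_mean:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


energies = [0.0, 1.0, 2.0, 3.0]      # illustrative box energies
g = [0.25, 0.25, 0.25, 0.25]         # equal a priori probabilities
theta = solve_modulus(energies, g, target_mean=1.0)
print(theta, canonical_distribution(energies, g, 1000.0, theta))
```

With equal $g_j$ and positive $\Theta$ the occupation numbers printed fall off from box to box, as the text states.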

Our understanding of the canonical distribution law (27') will be
enhanced by the consideration of a few special illustrations. First
suppose that $g_j = 1/\mu$ for all $j$ and $E_j = E/N$ for all $j$. Inspection of
(27') for this case shows that for all $j$

$$N_j = \frac{N}{\mu},$$

corresponding to a uniform distribution. If we forego the restriction
on $g_j$, save for $\sum g_j = 1$, the resulting distribution becomes

$$N_j = N g_j,$$

i.e., the same as that already studied in Sec. 1.

For the second illustration imagine that $E_j$ is an integral multiple
of a certain fundamental energy unit $\epsilon$, i.e., $E_j = (j - 1)\epsilon$ where $j$,
as usual, takes the values 1, 2, 3, .... Further assume that $\mu$ is so
large that we can effectively consider the sums over $j$ from 1 to $\mu$ as
infinite series. For simplicity we shall suppose that the $g_j$ are all
equal. Therefore since $e^{-\epsilon/\Theta} < 1$

$$\sum_j e^{-(j-1)\epsilon/\Theta} = \frac{1}{1 - e^{-\epsilon/\Theta}}.$$

Moreover by a simple extension of this

$$\sum_j (j - 1)\epsilon\, e^{-(j-1)\epsilon/\Theta} = \frac{\epsilon\, e^{-\epsilon/\Theta}}{(1 - e^{-\epsilon/\Theta})^2}.$$

In the sums $j$ runs from 1 to $\infty$. Consequently (30) yields for the
average energy

$$\bar{E} = \frac{\epsilon}{e^{\epsilon/\Theta} - 1}. \tag{32}$$

We shall later have occasion to appreciate the larger significance of
(32) (cf. Sec. 10, Chapter VIII), but for the present let us use it as a
means of getting a somewhat clearer light on $\Theta$. Solving (32) for $\Theta$
yields

$$\Theta = \frac{\epsilon}{\log (1 + \epsilon/\bar{E})}. \tag{33}$$

Suppose that $\epsilon \ll \bar{E}$, which is not too far-fetched an assumption.
Then we have the interesting approximation

$$\Theta \approx \bar{E}, \tag{33'}$$

or the modulus is approximately equal to the average energy per
particle. We shall investigate later a generalization of (33'). From
(28) we now have for the parameter $\psi$

$$\psi = -\Theta \log \frac{1}{1 - e^{-\epsilon/\Theta}},$$

which yields approximately

$$\psi \approx -\bar{E} \log \frac{\bar{E}}{\epsilon}. \tag{34}$$

If we choose to set $\Psi' = N\psi$, we can further write this

$$\Psi' \approx -E \log \frac{E}{N\epsilon}$$

in terms of the total energy. The distribution formula for this special
case is

$$N_j = N e^{-(j-1)\epsilon/\Theta} (1 - e^{-\epsilon/\Theta}), \tag{35}$$

or approximately

$$N_j \approx N e^{-(j-1)N\epsilon/E} (1 - e^{-N\epsilon/E}). \tag{36}$$
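
The closed forms (32) and (33) for equally spaced levels can be compared with a direct summation of (30). The sketch below is illustrative: the function names, the truncation of the series at a finite number of terms, and the sample values of $\epsilon$ and $\Theta$ are assumptions rather than anything prescribed in the text.

```python
import math


def mean_energy_by_summation(eps, theta, terms=2000):
    """Eq. (30) with E_j = (j - 1) eps and equal g_j, series cut off at `terms`."""
    numerator = denominator = 0.0
    for j in range(1, terms + 1):
        energy = (j - 1) * eps
        weight = math.exp(-energy / theta)
        numerator += energy * weight
        denominator += weight
    return numerator / denominator


def mean_energy_closed_form(eps, theta):
    """Eq. (32): eps / (exp(eps/Theta) - 1)."""
    return eps / math.expm1(eps / theta)


def modulus_from_mean_energy(eps, mean):
    """Eq. (33): Theta = eps / log(1 + eps/mean)."""
    return eps / math.log1p(eps / mean)


eps, theta = 1.0, 2.0                      # illustrative values
mean = mean_energy_closed_form(eps, theta)
print(mean, mean_energy_by_summation(eps, theta))
print(modulus_from_mean_energy(eps, mean))  # should recover theta
print(modulus_from_mean_energy(1e-3, 1.0))  # eps << mean: close to the mean, eq. (33')
```

The last line shows the approximation (33') at work: when the energy unit is small compared with the average energy, the modulus and the average energy nearly coincide.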

An illustration of more practical importance than the above is provided
by a collection of linear simple harmonic oscillators, all with
frequency $\nu$. It is shown by quantum mechanical reasoning 2 (also
cf. Chapter VIII) that the possible energy values of a linear harmonic
oscillator are given by

$$E_j = (j - \tfrac{1}{2})h\nu, \qquad j = 1, 2, 3, \ldots \tag{37}$$

where $h$ is Planck's constant of action ($6.55 \times 10^{-27}$ erg sec). The
a priori probabilities are all equal. Assuming again that the number
of states is very large, we have

$$\sum_j e^{-E_j/\Theta} = \frac{e^{-h\nu/2\Theta}}{1 - e^{-h\nu/\Theta}}. \tag{38}$$

2 Cf. Lindsay and Margenau, "Foundations of Physics," p. 430.

Consequently the distribution function takes the form

$$N_j = N e^{-(j-1)h\nu/\Theta} (1 - e^{-h\nu/\Theta}), \tag{39}$$

which is identical with (35) with $\epsilon = h\nu$. However, a somewhat different
and very interesting situation presents itself with regard to the
average energy per oscillator. If we differentiate (38) with respect to
$\Theta$ the result is

$$\sum_j \frac{E_j}{\Theta^2} e^{-E_j/\Theta} = \frac{h\nu\, e^{-h\nu/2\Theta} (1 + e^{-h\nu/\Theta})}{2\Theta^2 (1 - e^{-h\nu/\Theta})^2}, \tag{40}$$

and from (30) this gives

$$\bar{E} = \frac{h\nu}{2}\,\frac{1 + e^{-h\nu/\Theta}}{1 - e^{-h\nu/\Theta}} = \frac{h\nu}{2} + \frac{h\nu}{e^{h\nu/\Theta} - 1}. \tag{41}$$

The significant feature of this result is that, unlike (32), the average
energy no longer vanishes when $\Theta$ becomes zero. In fact

$$(\bar{E})_{\Theta = 0} = \frac{h\nu}{2} = E_1. \tag{42}$$

We shall later learn to call this a "zero-point" energy. It is clear
that this situation will always arise when the lowest possible energy
value is different from zero as it is in the present case.
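
The zero-point behavior of (41) shows up plainly in a numerical evaluation. The following sketch is illustrative only; the function names, the series cutoff and the unit choice $h\nu = 1$ are assumptions. It compares the closed form (41) with a direct summation of (30) over the levels (37), and then evaluates (41) at a small modulus.

```python
import math


def oscillator_mean_by_summation(h_nu, theta, terms=2000):
    """Eq. (30) over the levels E_j = (j - 1/2) h nu with equal g_j."""
    numerator = denominator = 0.0
    for j in range(1, terms + 1):
        energy = (j - 0.5) * h_nu
        weight = math.exp(-energy / theta)
        numerator += energy * weight
        denominator += weight
    return numerator / denominator


def oscillator_mean_closed_form(h_nu, theta):
    """Eq. (41): h nu / 2 + h nu / (exp(h nu / Theta) - 1)."""
    return 0.5 * h_nu + h_nu / math.expm1(h_nu / theta)


h_nu = 1.0  # energies measured in units of h nu (an arbitrary choice)
print(oscillator_mean_closed_form(h_nu, 1.0), oscillator_mean_by_summation(h_nu, 1.0))
print(oscillator_mean_closed_form(h_nu, 0.02))  # approaches h nu / 2 as Theta -> 0
```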

In the collection of simple harmonic oscillators considered above
the a priori probabilities are all equal. It turns out that if the collection
contains two-dimensional harmonic oscillators, this situation no
longer prevails. Rather the $g_j$ increase linearly with $j$, so that we can
write

$$g_j = \frac{j}{C}, \qquad (j = 1, 2, 3, \ldots)$$

where $C$ is a constant depending on the number of energy boxes.
Division by this constant is necessary to make $g_j$ a probability in the
sense in which we are using the term. Actually this constant plays no
role so far as the average energy is concerned. Irrespective of the
value of $C$, eq. (30) gives for the average energy

$$\bar{E} = \frac{\sum_j j E_j e^{-E_j/\Theta}}{\sum_j j e^{-E_j/\Theta}}.$$

Quantum mechanics yields for the two-dimensional oscillator

$$E_j = jh\nu.$$

Consequently

$$\bar{E} = h\nu\, \frac{\sum_j j^2 e^{-jh\nu/\Theta}}{\sum_j j e^{-jh\nu/\Theta}}.$$

Now by differentiating $\sum_j e^{-jh\nu/\Theta} = 1/(e^{h\nu/\Theta} - 1)$ with respect to $\Theta$,
we secure

$$\sum_j j e^{-jh\nu/\Theta} = \frac{e^{h\nu/\Theta}}{(e^{h\nu/\Theta} - 1)^2}. \tag{45}$$

A second differentiation yields

$$\sum_j j^2 e^{-jh\nu/\Theta} = \frac{e^{h\nu/\Theta}(e^{h\nu/\Theta} + 1)}{(e^{h\nu/\Theta} - 1)^3}. \tag{46}$$

Consequently

$$\bar{E} = h\nu\, \frac{e^{h\nu/\Theta} + 1}{e^{h\nu/\Theta} - 1} = h\nu + \frac{2h\nu}{e^{h\nu/\Theta} - 1}. \tag{47}$$

This should be compared with (41) and the change in the "zero-point"
energy noted.
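
The two-dimensional result (47) can be checked in the same spirit. This is a sketch under assumptions: the function names, the series cutoff and the unit choice $h\nu = 1$ are illustrative. The weights are taken proportional to $j$, so the constant $C$ never enters, in line with the remark above that it plays no role in the average energy.

```python
import math


def mean_energy_2d_by_summation(h_nu, theta, terms=2000):
    """Eq. (30) with g_j proportional to j and E_j = j h nu; C cancels."""
    numerator = denominator = 0.0
    for j in range(1, terms + 1):
        weight = j * math.exp(-j * h_nu / theta)
        numerator += j * h_nu * weight
        denominator += weight
    return numerator / denominator


def mean_energy_2d_closed_form(h_nu, theta):
    """Eq. (47): h nu + 2 h nu / (exp(h nu / Theta) - 1)."""
    return h_nu + 2.0 * h_nu / math.expm1(h_nu / theta)


h_nu = 1.0  # energies measured in units of h nu (an arbitrary choice)
print(mean_energy_2d_closed_form(h_nu, 1.0), mean_energy_2d_by_summation(h_nu, 1.0))
print(mean_energy_2d_closed_form(h_nu, 0.02))  # approaches h nu, twice the value in (42)
```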

3. PROPERTIES OF THE CANONICAL DISTRIBUTION. INTERPRETA- 
TION OF THERMODYNAMIC RELATIONS 

It is now in order to study some properties of the canonical distri- 
bution and to indicate analogies with the important thermodynamic 
relations. 

Let us evaluate the expression for $\log P_N$ (eq. 16) in the case of a
canonical distribution. We have

$$\log P_N = N \log N - N - \sum \left(N_j \log \frac{N_j}{g_j} - N_j\right). \tag{48}$$

Substitution of $N_j$ from (27) yields ultimately on reduction, if we
denote the canonical value of $P_N$ by $P_c$,

$$\log \mu^N P_c = \frac{E - \Psi'}{\Theta} = \log w, \tag{49}$$

where we have again replaced $N\psi$ by $\Psi'$, and where in what follows
we shall always consider $\mu^N P_c$ replaced by $w$. We shall refer to the
latter as the "statistical probability" for a canonical distribution.
We shall now imagine that the system of particles undergoes a
change in its total energy, taking place in two ways, viz., (a) by a
change $dE_j$ in the amount of energy associated with each box, and
(b) by a change in the distribution over the boxes, the change in the
number of particles in the $j$th box being denoted by $dN_j$. The total
change in energy then appears as

$$dE = d\sum N_j E_j = \sum E_j\,dN_j + \sum N_j\,dE_j. \tag{50}$$



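There is another way of looking at this which is rather expeditious
and moreover will prove very useful when we come to alternative
presentations.

Let us introduce the transformation

$$\zeta = e^{-1/\Theta}. \tag{51}$$

We can then write from (29)

$$E = N\, \frac{\sum \mu g_j E_j \zeta^{E_j}}{\sum \mu g_j \zeta^{E_j}}. \tag{52}$$

We shall moreover find it convenient to denote $\sum \mu g_j \zeta^{E_j}$ by a special
symbol and use $Z$ for this purpose. Thus

$$Z = \sum \mu g_j \zeta^{E_j}. \tag{53}$$

Now

$$\frac{\partial Z}{\partial \zeta} = \sum \mu g_j E_j \zeta^{E_j - 1}, \tag{54}$$

and therefore

$$E = \frac{N\zeta}{Z} \frac{\partial Z}{\partial \zeta} = N\zeta\, \frac{\partial \log Z}{\partial \zeta}. \tag{55}$$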

This proves to be a very useful mode of expression for the energy of
the system. We may call $Z$ the distribution function. Later we shall
find something very like this called the partition function in the method
of Darwin and Fowler (Chapter VII). We shall now consider the
total change in $E$ associated with a change in the modulus $\zeta$ (or $\Theta$)
and the changes in $E_j$, the latter being lumped in the change in $Z$.
Now we shall assume that the function $Z$ depends on $\zeta$ as well as on
certain external coordinates, namely the $\xi_1, \ldots, \xi_n$ already mentioned in
connection with the thermodynamical treatment in Sec. 2, Chapter III.
When we carry out the differentiation with this in mind we get for
the total change in $E$

$$dE = N \frac{\partial}{\partial \zeta}\left(\zeta\, \frac{\partial \log Z}{\partial \zeta}\right) d\zeta + N\zeta \sum_i \frac{\partial^2 \log Z}{\partial \zeta\, \partial \xi_i}\, d\xi_i. \tag{56}$$

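Next we proceed to consider the work done in the change in the
external coordinates. For a particle in the $j$th state in which the
energy is $E_j$ our alteration in $\xi_i$ brings into play a force $-\partial E_j/\partial \xi_i$ and
the work done by this particle when all the parameters change by
$d\xi_1, d\xi_2, \ldots, d\xi_n$ is

$$-\sum_i \frac{\partial E_j}{\partial \xi_i}\, d\xi_i. \tag{57}$$

Now the average number of particles in the $j$th state is $N_j$, given by
(27) or (27'), which by employing (53) can be written in the form

$$N_j = \frac{N \mu g_j \zeta^{E_j}}{Z}. \tag{58}$$

Consequently the contribution to the work by all particles in the $j$th
state is $-\dfrac{N \mu g_j \zeta^{E_j}}{Z} \sum_i \dfrac{\partial E_j}{\partial \xi_i}\, d\xi_i$ and that by all systems in all states is clearly

$$dW = -\frac{N}{Z} \sum_j \sum_i \mu g_j \zeta^{E_j}\, \frac{\partial E_j}{\partial \xi_i}\, d\xi_i. \tag{59}$$

From the fact that

$$\frac{\partial Z}{\partial \xi_i} = \sum_j \mu g_j \zeta^{E_j} \log \zeta\, \frac{\partial E_j}{\partial \xi_i}, \tag{60}$$

we can write

$$dW = -\frac{N}{\log \zeta} \sum_i \frac{\partial \log Z}{\partial \xi_i}\, d\xi_i. \tag{61}$$

Therefore the change in the total energy of the system plus the external
work done by the system is

$$dE + dW = N \frac{\partial}{\partial \zeta}\left(\zeta\, \frac{\partial \log Z}{\partial \zeta}\right) d\zeta + N\zeta \sum_i \frac{\partial^2 \log Z}{\partial \zeta\, \partial \xi_i}\, d\xi_i - \frac{N}{\log \zeta} \sum_i \frac{\partial \log Z}{\partial \xi_i}\, d\xi_i. \tag{62}$$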

Let us now go back to the expression (49) for $\log w$. From (28),
(51) and (53) this may be written

$$\log w = -E \log \zeta + N \log Z. \tag{63}$$

The change in $\log w$ when $\zeta$ and the $\xi_i$ alter therefore is

$$d \log w = -dE \log \zeta - \frac{E\,d\zeta}{\zeta} + N \frac{\partial \log Z}{\partial \zeta}\, d\zeta + N \sum_i \frac{\partial \log Z}{\partial \xi_i}\, d\xi_i. \tag{64}$$

From (55) it is clear that the two middle terms in (64) cancel each
other, so that finally

$$d \log w = -dE \log \zeta + \frac{N}{Z} \sum_i \frac{\partial Z}{\partial \xi_i}\, d\xi_i.$$

If we divide through by $\log \zeta$ we get

$$\frac{d \log w}{\log \zeta} = -dE + \frac{N}{Z \log \zeta} \sum_i \frac{\partial Z}{\partial \xi_i}\, d\xi_i.$$

But from (51) and (61) this at once becomes

$$\Theta\, d \log w = dE + dW. \tag{65}$$

This very interesting relation evidently bears a close analogy to the
first law of thermodynamics (eqs. 1 and 4, Chapter III). We can
indeed look upon it as the statistical equivalent of the first law if we
are willing to identify the increment of quantity of heat transferred
to the system of particles as $\Theta\, d \log w$. Let us then write

$$\frac{\Delta Q}{\Theta} = d \log w. \tag{66}$$

Now the statistical quantity $\log w$ has the property that its value
depends solely on the state of the system in its canonical distribution
and does not depend on how the system got into that state. It is
therefore competent to serve as a statistical analogue of a thermodynamical
state variable. In fact we see from (66) that if we interpret
the statistical canonical distribution modulus $\Theta$ as a universal constant
$k$ times the absolute temperature $T$, we can write 3

$$d[k \log w] = \frac{\Delta Q}{T}. \tag{67}$$

Comparison with eq. (2) in our discussion of thermodynamics in Chapter III
suggests that we interpret the left side of (67) as the change in
entropy of the system and write

$$d[k \log w] = dS. \tag{68}$$

The integration of (68) then yields for the statistical interpretation of
entropy

$$S = k \log w + C, \tag{69}$$

where $C$ is an arbitrary additive constant of integration. We are
entitled to choose for $C$ the constant quantity which will make (69)
agree best with the known thermodynamical properties of the entropy.

3 $k$ turns out to be the Boltzmann gas constant with the value $1.37 \times 10^{-16}$
ergs/°C (Cf. Secs. 1 and 2, Chapter V).

With this in mind we find it advantageous to choose $C = -k \log N!$
and write our statistical definition of entropy in the form

$$S = k \log \frac{w}{N!}. \tag{69'}$$



We may indeed proceed to define $w/N!$ as the "effective" statistical
probability for the canonical distribution. Examination of the preceding
sections of this chapter discloses that if we had used it in place
of $w$ no change would have resulted in the canonical distribution. In
Chapter VIII we shall see a close connection between the definition
(69') in classical statistics and the quantum mechanical definition of
entropy. It is well to emphasize that the relations (68) and (69') are
dependent on the fact that the system of particles is canonically distributed.
Since the equations of thermodynamics refer to systems in
equilibrium it is therefore appropriate to assume that the canonical
distribution is the statistical analogue of equilibrium. This corresponds
well with the fact that $\log P_c$ (and likewise $\log w$) is a maximum
for a canonical distribution as compared with any other distribution
of the same system of particles with the same energy. The same is of
course true for $\log (w/N!)$. If there were a configuration of greater
probability than the canonical distribution we should expect that the
system would not rest in equilibrium until it had attained this more
probable state. On this view a system with a fixed number of particles
and given energy not in an equilibrium state will correspond to
a smaller value of $\log (w/N!)$. But such a system will tend to approach
a state of greater probability and this is the statistical interpretation
of the irreversible tendency of the entropy of a closed system to
increase; the latter has already been emphasized in the thermodynamic
definition of entropy in Sec. 2, Chapter III. There is, to be
sure, a fundamental difference between the second law of thermodynamics
and the law based on the assumption (69) or (69'). According
to the second law, the entropy of a closed system always increases
in a non-cyclic, irreversible process. According to the statistical interpretation
it is only probable that the entropy will increase. From the
very nature of the statistical definition of entropy there is no necessity
that the entropy must always increase under the conditions stated.
There will always be indeed a finite probability that it will decrease.
This may be looked upon as the price we have to pay for the statistical
interpretation.
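
The statement that the canonical distribution has a larger probability than any other distribution with the same number of particles and the same energy can be illustrated on a small model. The sketch below rests on several assumptions that are not in the text: three boxes with energies 0, 1, 2, equal $g_j$, a modulus of 1.5, an aggregate of $10^6$ particles, and the use of the log-gamma function in place of $\log N_j!$ so that non-integral occupation numbers can be handled. A shift of $(+t, -2t, +t)$ in the occupation numbers leaves both $\sum N_j$ and $\sum N_j E_j$ unchanged for these energies.

```python
import math


def log_probability(counts, g):
    """log P_N of eq. (16); lgamma(n + 1) stands in for log n!."""
    value = math.lgamma(sum(counts) + 1.0)
    for nj, gj in zip(counts, g):
        value += nj * math.log(gj) - math.lgamma(nj + 1.0)
    return value


def canonical_counts(energies, g, n_total, theta):
    """Eq. (27'): canonical occupation numbers for modulus theta."""
    weights = [gj * math.exp(-ej / theta) for gj, ej in zip(g, energies)]
    norm = sum(weights)
    return [n_total * w / norm for w in weights]


def shifted(counts, t):
    """Move particles so that N and E stay fixed for energies 0, 1, 2."""
    return [counts[0] + t, counts[1] - 2.0 * t, counts[2] + t]


energies = [0.0, 1.0, 2.0]              # illustrative box energies
g = [1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0]   # equal a priori probabilities
canonical = canonical_counts(energies, g, 1.0e6, theta=1.5)

for t in (-1000.0, 0.0, 1000.0):
    print(t, log_probability(shifted(canonical, t), g))
```

The middle value printed, belonging to the unshifted canonical distribution, should be the largest of the three.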

We can illustrate the above situation by attempting to treat the
change in entropy during a given process, as statistically defined in
(68), as a measure of the irreversibility of the process. Suppose the
system changes from state 1, where the statistical probability is $w$,
to state 2 with probability $w + \Delta w$. It seems appropriate to take
the measure of the irreversibility in going from 1 to 2 as the ratio
$(w + \Delta w)/w$. We shall call this $z$. Then

$$\log z = \log \frac{w + \Delta w}{w} = \Delta \log w,$$

whence

$$z = e^{\Delta \log w} = e^{\Delta S/k}.$$

As an example, consider the process involved in the passage of 1 erg
of heat energy from a body at 21°C to another body at 20°C brought
in contact with the first. If we replace the process by an equivalent
reversible one, i.e., one carried out ideally with infinite slowness, we
can calculate the entropy change $\Delta S$ by (67) and get on substitution
of the data $\Delta S/k = 8.5 \times 10^{10}$. Hence in this case
$z > 10^{10^{10}}$, indicating a high degree of irreversibility. It is
rather interesting to consider what happens when the quantity of heat
energy transferred is only $10^{-11}$ erg. Then $z = e^{0.85}$, which is
less than 3. Indeed as the amount of energy transferred becomes smaller
and smaller, $z \to$ unity and $w$ for the two states approaches the same
value. For very small energy transfers the process becomes less and less
irreversible from the statistical standpoint. This casts a further
interesting light on the significance of the statistical interpretation
of entropy.
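
The arithmetic of this example can be checked with a short sketch. It assumes the value $k = 1.37 \times 10^{-16}$ erg/degree quoted in Chapter V and takes 0°C as 273 K; the function and variable names are illustrative only and do not come from the text.

```python
import math

K_BOLTZMANN = 1.37e-16  # erg/K, value quoted in Chapter V of this book

def entropy_change_over_k(heat_erg, t_hot, t_cold, k=K_BOLTZMANN):
    """Delta S / k when heat_erg passes from a body at t_hot to one at t_cold (kelvin)."""
    # Entropy gained by the cold body minus entropy lost by the hot body.
    delta_s = heat_erg * (1.0 / t_cold - 1.0 / t_hot)
    return delta_s / k

T_HOT, T_COLD = 294.0, 293.0  # 21 C and 20 C, taking 0 C = 273 K (assumption)

ratio_large = entropy_change_over_k(1.0, T_HOT, T_COLD)
ratio_small = entropy_change_over_k(1.0e-11, T_HOT, T_COLD)

# z = exp(Delta S / k) overflows a float in the 1 erg case, so report log10(z).
log10_z_large = ratio_large / math.log(10.0)
z_small = math.exp(ratio_small)

print(f"1 erg:     Delta S/k = {ratio_large:.2e}, log10(z) = {log10_z_large:.2e}")
print(f"1e-11 erg: Delta S/k = {ratio_small:.2f}, z = {z_small:.2f}")
```

For the 1 erg transfer the exponent is so large that only its logarithm can be represented, which is itself a fair picture of what "a high degree of irreversibility" means here.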

It is now not a difficult matter to find the statistical analogues of
the other important thermodynamic variables. Consider again the
Helmholtz free energy (eq. 6 of Chapter III)

$$\Psi = E - TS.$$

With the statistical interpretation of $S$ this becomes

$$\Psi = E - kT \log w + kT \log N!. \tag{70}$$

But now compare this with eq. (49). Let us set as usual

$$\Theta = kT. \tag{71}$$

Then from (28) and (49) and the use of Stirling's formula

$$\Psi = -NkT\left(\log \frac{Z}{N} + 1\right) \tag{72}$$

is the statistical expression for the free energy in terms of the
partition function as we have introduced it in eq. (53). This of course
assumes that the constant $k$, whose value is not specified by the
statistical theory, is chosen correctly (cf. Secs. 1 and 2, Chapter V).
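We should be able to use the expression (72) to derive the equation
of state of the system from eq. (12) of Chapter III. Thus the pressure
should be given by the equation

$$p = -\left(\frac{\partial \Psi}{\partial V}\right)_T = NkT\,\frac{\partial \log Z}{\partial V}. \tag{73}$$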

If therefore we were to derive the expression for the partition function 
Z for a system of free particles and in particular its dependence on 
the physical volume occupied by the system, eq. (73) should reduce 
to the well-known equation of state of a perfect gas, viz., 

pV = RT, (74) 

where R is the so-called gas constant. Now it is clear from the form 
of (73) that in order for it to yield (74) it is essential that 

Z = KV, (75) 

where $K$ is independent of $V$, though it may contain the mass of the
particles, the parameter $\Theta$, etc., and must have the dimensions of
reciprocal volume. In the next section we attempt the calculation of $Z$.

4. DISTRIBUTION OR PARTITION FUNCTION FOR A SYSTEM OF FREE 
PARTICLES. FREE ENERGY OF AN IDEAL GAS 

The problem is to obtain for an ideal gas the volume dependence 
of the function 

$$Z = \sum_j \mu g_j e^{-E_j/\Theta}, \tag{76}$$

where

$$E_j = \frac{1}{2m}\left(p_{jx}^2 + p_{jy}^2 + p_{jz}^2\right). \tag{77}$$

All particles have the same mass, $m$, and the $p_{jx}$, $p_{jy}$ and $p_{jz}$
are the component momenta along the x, y, z axes respectively of a particle
in the jth box. Since the gas is assumed to be ideal, the energy is
kinetic only. Before we can use (77) to evaluate (76), we must introduce
the appropriate a priori probabilities $g_j$. From our previous
discussion (cf. Chapter II, Sec. 9) it is reasonable to assume that $g_j$ is
proportional to the size of the jth box. Clearly, however, size here
does not refer merely to physical volume but also to momentum
interval as well. Let us suppose that a particle in the jth box has its
configuration coordinates in the volume element
$\Delta x_j \Delta y_j \Delta z_j$ about the rectangular coordinates
$x_j, y_j, z_j$, and its momentum components in the momentum interval
$\Delta p_{jx} \Delta p_{jy} \Delta p_{jz}$ in the neighborhood of the
momentum components $p_{jx}, p_{jy}, p_{jz}$. It is then natural to suppose
that $g_j$ will be equal to
$\Delta x_j \Delta y_j \Delta z_j \Delta p_{jx} \Delta p_{jy} \Delta p_{jz}$
divided by the total "volume" of the six-dimensional space defined by the
coordinates $x_j, y_j, z_j, p_{jx}, p_{jy}, p_{jz}$ and whose total extent is
given by the range of variation allowed to these coordinates. We have now
to decide whether the coordinates shall be allowed to vary continuously or
in discrete amounts. We shall make the assumption, which will prove of
value in the later discussion of quantum statistics, that the
six-dimensional space under consideration has a cellular structure in
which the cell has volume $h^3$, where $h$ is a fundamental constant having
the dimensions of momentum times displacement. This assumption means that
no matter how much $x_j, y_j, z_j, p_{jx}, p_{jy}, p_{jz}$ vary among themselves

$$\Delta x_j \Delta y_j \Delta z_j \Delta p_{jx} \Delta p_{jy} \Delta p_{jz} \geq h^3. \tag{78}$$

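We shall now assume further that there are $\mu$ cells available. Hence
finally

$$g_j = \frac{\Delta x_j \Delta y_j \Delta z_j \Delta p_{jx} \Delta p_{jy} \Delta p_{jz}}{\mu h^3}. \tag{79}$$

Consequently the partition function becomes

$$Z = \frac{1}{h^3} \sum_j \Delta x_j \Delta y_j \Delta z_j \Delta p_{jx} \Delta p_{jy} \Delta p_{jz}\, e^{-(p_{jx}^2 + p_{jy}^2 + p_{jz}^2)/2m\Theta}. \tag{80}$$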

Now since the coordinates are all independent of each other and since
the exponential term does not involve any configuration coordinates
we can carry out the summation over the $\Delta x_j$, etc., independently
of the momenta and have

$$\sum_j \Delta x_j \Delta y_j \Delta z_j = V, \tag{81}$$



where $V$ is the physical volume occupied by the particles. Since the
$\Delta p_{jx}$, etc., can be made as small as we please subject only to
condition (78), we shall assume that to a very high degree of approximation
($h$ being assumed to be a very small constant) we can replace the
summation over the $p_{jx}$, etc., by a triple integration. It should be
emphasized that this replacement of a finite sum by a definite integral
is here purely a matter of mathematical convenience and has nothing
fundamentally to do with the essential nature of the Maxwell-Boltzmann
statistics. In the evaluation of the integral we shall further
suppose that the limits for each variable are $-\infty$ and $+\infty$
respectively. Actually there will be finite maximum and minimum values of
the momentum components for any actual aggregate of particles.
However the exponential integrand renders the choice of infinite limits
a very acceptable approximation (cf. the remarks after eq. 30 in
Chapter II). Now by a simple transformation and extension of eq.
(23) of Chapter II, we have

$$\int_{-\infty}^{+\infty}\!\int_{-\infty}^{+\infty}\!\int_{-\infty}^{+\infty} e^{-(p_x^2 + p_y^2 + p_z^2)/2m\Theta}\, dp_x\, dp_y\, dp_z = (2\pi m\Theta)^{3/2}. \tag{82}$$

Hence

$$Z = \frac{V}{h^3}(2\pi m\Theta)^{3/2}. \tag{83}$$

This is indeed of the form of eq. (75) in the last section and thus leads 
to the correct equation of state for an ideal gas by substitution into 
eq. (73). From the form of the partition function for an ideal gas as 
obtained in (83) it results that the free energy (72) is an extensive 
quantity, i.e., at given temperature it is directly proportional to the 
number of particles and does not depend directly on the volume of the 
system. In fact if we substitute from (83) into (72), the result for
the free energy of an ideal gas is

$$\Psi = -NkT\left[\log\left(\frac{V}{N}\,\frac{(2\pi mkT)^{3/2}}{h^3}\right) + 1\right]. \tag{84}$$

If two systems are combined so that the density and temperature 
remain the same the free energy of the combination is the sum of the 
free energies of the individual systems. It is again assumed that the 
constant k is chosen correctly. 
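
The chain from (83) through (72) to the equation of state can be tried numerically. The following is a minimal sketch, not the author's procedure: it assumes a numerical value for $h$ (the text only says that $h$ is a small fundamental constant), borrows $k$, the Avogadro number, the molar volume and the hydrogen mass from Chapter V, and differentiates the free energy by a finite difference instead of analytically. All function names are illustrative.

```python
import math

H_CELL = 6.62e-27       # erg s, ASSUMED numerical value for the cell constant h
K_BOLTZMANN = 1.37e-16  # erg/K, value quoted in Chapter V

def partition_function(volume, mass, temp):
    """Z = (V / h^3) (2 pi m k T)^(3/2), eq. (83) with Theta = k T."""
    return volume / H_CELL**3 * (2.0 * math.pi * mass * K_BOLTZMANN * temp) ** 1.5

def free_energy(n, volume, mass, temp):
    """Psi = -N k T [log(Z/N) + 1], eq. (72)."""
    z = partition_function(volume, mass, temp)
    return -n * K_BOLTZMANN * temp * (math.log(z / n) + 1.0)

def pressure(n, volume, mass, temp, rel_step=1.0e-6):
    """p = -dPsi/dV at constant T, estimated by a central finite difference."""
    dv = volume * rel_step
    upper = free_energy(n, volume + dv, mass, temp)
    lower = free_energy(n, volume - dv, mass, temp)
    return -(upper - lower) / (2.0 * dv)

# One mole of hydrogen at 0 C in 22.41e3 cm^3 (figures taken from Chapter V).
N, V, M_H2, T = 6.03e23, 22.41e3, 3.32e-24, 273.0

p_numeric = pressure(N, V, M_H2, T)
p_ideal = N * K_BOLTZMANN * T / V  # the perfect gas law pV = NkT

psi_single = free_energy(N, V, M_H2, T)
psi_double = free_energy(2 * N, 2 * V, M_H2, T)  # same density and temperature

print(f"p from free energy = {p_numeric:.4e} dynes/cm^2, NkT/V = {p_ideal:.4e}")
print(f"Psi(2N, 2V) / Psi(N, V) = {psi_double / psi_single:.6f}")
```

Doubling both the number of particles and the volume doubles the free energy, which is the additivity property stated in the closing sentences of this section.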

5. EQUIPARTITION OF ENERGY IN A SYSTEM OF FREE PARTICLES. 
ENTROPY OF AN IDEAL GAS 

We can at once extend the results of the preceding section in an 
interesting fashion to obtain an expression for the average energy per 
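particle of the system. If we differentiate

$$Z = \frac{V}{h^3}(2\pi m\Theta)^{3/2}$$

with respect to $\Theta$ we obtain

$$\frac{dZ}{d\Theta} = \frac{3V}{2h^3}(2\pi m)^{3/2}\,\Theta^{1/2}. \tag{85}$$

Hence (eq. 30)

$$\bar{E} = \frac{\Theta^2}{Z}\frac{dZ}{d\Theta} = \frac{3\Theta}{2}. \tag{86}$$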

This significant result gives the average energy directly in terms of the 
distribution modulus. It is possible to interpret the equation as express- 
ing the equipartition of the kinetic energy among the various degrees 
of freedom of the individual particles. Each of the latter has three 
degrees of freedom. If we assign each degree of freedom on the average
the amount of energy $\Theta/2$, the total per particle is $3\Theta/2$, which
is the equivalent of (86). As a matter of fact this interpretation is
confirmed by computing the average energy per particle for the component
motion along the x axis only, which at once yields $\Theta/2$. From
the symmetry of the integrand in (82) the same result will follow for
the y and z degrees of freedom. We reach therefore the general
conclusion that on the average the kinetic energy of an aggregate of free
particles in a canonical distribution is distributed equally among their 
degrees of freedom. It is important to emphasize that the result (86) 
depends for its validity on the approximation of sums by integrals. 
We shall see that this is true for any statistics in which discrete cells 
or boxes are employed. The situation is rather different in the method 
of Gibbs, which will be treated in Chapter VI. We should further 
emphasize that the system to which the equipartition principle applies 
is an aggregate of independent particles, possessing kinetic energy 
only. In any case we shall find the equipartition principle of con- 
siderable value in our future discussion of aggregates of particles. 
In particular we shall use it in connection with the kinetic theory of 
gases which will be reviewed in the next chapter. 
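
The statement that the x component of the motion carries on the average the energy $\Theta/2$ can be illustrated by replacing the integral by a finite sum, much as in the cell picture of Sec. 4. The sketch below is only a numerical illustration: the midpoint rule, the cutoff of twelve momentum widths and the names used are choices made here, not part of the text.

```python
import math

def mean_kinetic_energy_x(mass, theta, n_steps=20000, cutoff=12.0):
    """Average of p_x^2 / 2m with weight exp(-p_x^2 / (2 m theta)).

    Midpoint-rule sum over |p_x| <= cutoff * sqrt(m * theta); the finite
    cutoff stands in for the infinite limits used in eq. (82).
    """
    p_max = cutoff * math.sqrt(mass * theta)
    dp = 2.0 * p_max / n_steps
    numerator = 0.0
    denominator = 0.0
    for i in range(n_steps):
        p = -p_max + (i + 0.5) * dp
        weight = math.exp(-p * p / (2.0 * mass * theta))
        numerator += (p * p / (2.0 * mass)) * weight * dp
        denominator += weight * dp
    return numerator / denominator

M_H2 = 3.32e-24          # gram, mass of a hydrogen molecule (Chapter V)
THETA = 1.37e-16 * 273.0 # erg, Theta = kT at 0 C with k from Chapter V

e_x = mean_kinetic_energy_x(M_H2, THETA)
print(f"mean x kinetic energy / Theta = {e_x / THETA:.6f}")
```

Because the weight factorizes in $p_x$, $p_y$, $p_z$, the same sum serves for each of the three degrees of freedom, and the total is three times the value computed here.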

It will be of interest to apply eq. (86) to the derivation of the
entropy of an ideal gas. From the definition (69') we have

$$S = k \log w - kN \log N + kN. \tag{87}$$

But

$$k \log w = E/T + Nk \log Z,$$

where $E$ is the total energy of the gas, which from (86) becomes
$E = 3NkT/2$ (with $\Theta$ replaced by $kT$, as before). The use of (83)
coupled with substitution into (87) yields finally for the entropy of
the ideal gas

$$S = Nk\left[\log\left(\frac{V}{N}\,\frac{(2\pi mkT)^{3/2}}{h^3}\right) + \frac{5}{2}\right]. \tag{88}$$



It follows that the entropy is an extensive quantity in the same sense
as the free energy, as discussed in Sec. 4. The reader should note
that this result would not have been obtained if we had defined the
entropy simply as $S = k \log w$. It is the choice of the additive
constant $-k \log N!$ which assures that $S$ shall have the extensive
property. 4
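
The role of the additive constant can be seen numerically by doubling the system at fixed density and temperature. The sketch below assumes a numerical value for $h$, uses Stirling's formula $\log N! \approx N \log N - N$ as in (87), and takes the other figures from Chapter V; the function names are illustrative.

```python
import math

H_CELL = 6.62e-27       # erg s, ASSUMED numerical value for the cell constant h
K_BOLTZMANN = 1.37e-16  # erg/K, value quoted in Chapter V

def log_z_over_n(n, volume, mass, temp):
    """log(Z/N) with Z = (V / h^3)(2 pi m k T)^(3/2), eq. (83)."""
    z = volume / H_CELL**3 * (2.0 * math.pi * mass * K_BOLTZMANN * temp) ** 1.5
    return math.log(z / n)

def entropy(n, volume, mass, temp):
    """S = N k [log(Z/N) + 5/2], the form (87) takes for the ideal gas."""
    return n * K_BOLTZMANN * (log_z_over_n(n, volume, mass, temp) + 2.5)

def entropy_k_log_w(n, volume, mass, temp):
    """S' = k log w, i.e. S with the term k log N! restored (Stirling form)."""
    return entropy(n, volume, mass, temp) + K_BOLTZMANN * (n * math.log(n) - n)

N, V, M_H2, T = 6.03e23, 22.41e3, 3.32e-24, 273.0

# Combine two identical systems: 2N particles in 2V at the same temperature.
excess_s = entropy(2 * N, 2 * V, M_H2, T) - 2.0 * entropy(N, V, M_H2, T)
excess_s_prime = (entropy_k_log_w(2 * N, 2 * V, M_H2, T)
                  - 2.0 * entropy_k_log_w(N, V, M_H2, T))

print(f"excess with the N! term:    {excess_s:.3e} erg/degree")
print(f"excess without the N! term: {excess_s_prime:.3e} erg/degree")
```

Without the $N!$ term the combined system shows an excess of $2Nk \log 2$, so that $k \log w$ alone is not additive.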

PROBLEMS 

1. Derive the distribution law $N_j = Ng_j$ (eq. 7 of this chapter) by varying the
$N_j$ subject to $\sum_j N_j = N$, while keeping the $g_j$ fixed.

2. Find the expression for the free energy of the set of independent linear simple
harmonic oscillators whose possible energy values are given by
$E_j = (j - \tfrac{1}{2})h\nu$, ($j = 1, 2, 3, \ldots$) (cf. eq. 37). Do the same
for a set of two-dimensional oscillators for which the average energy is given
by (47).

3. From the partition function Z for a set of independent linear simple harmonic 
oscillators derive the expression for the entropy. 

4. Show that if the entropy is defined as $S = k \log w$, it is not an extensive
quantity for any ideal gas and that the same is true of the corresponding free
energy.

5. Derive the expression for the Gibbs free energy, $\Psi + pV$, for an ideal gas.
Do the same for the enthalpy (cf. Sec. 3, Chapter III).

6. In a canonical distribution of N free particles find the expression for the
number of particles having kinetic energy included between E and E + dE. Find
the ratio between the number of particles with energy in the range dE about the
average energy $\bar{E} = 3\Theta/2$ and the number of particles with energy in
the same range dE about an energy differing by 1 per cent from $\bar{E}$.

7. Compute the " effective statistical probability " for the canonical distribution 
of one mole of an ideal gas under standard conditions. 

8. The methods of this chapter may be applied to an aggregate of independent
particles which move in an external conservative force field characterized by the
potential function $\phi(x, y, z)$. Find the general form of the partition function
of such an aggregate. Specialize to the case in which the force field is the
constant gravitational field near the surface of the earth.

4 This point is well brought out in Mayer and Mayer, "Statistical Mechanics," 
pp. 114 ff. John Wiley & Sons, 1940. 



CHAPTER V 
THE KINETIC THEORY OF GASES 

1. THE VIRIAL AND THE EQUATION OF STATE OF AN IDEAL GAS 

It is assumed that the reader is acquainted with the fundamental 
assumptions of the kinetic theory of gases from at least an elementary 
point of view. It is the intention of this chapter to review briefly 
its chief results, emphasizing in particular the points of contact with 
the statistical theory of the preceding chapter. 

The kinetic theory envisages a gas as composed of a very large 
number of material particles called molecules moving with widely 
varying velocities in all directions and colliding with each other and 
with the walls of the confining vessel. The collisions with the walls 
are assumed to be responsible for the pressure of the gas while the 
average kinetic energy of the molecules is connected with the observed 
temperature of the gas. The first fundamental task of the kinetic 
theory is to provide a theoretical deduction of the equation of state 
of the gas, i.e., the relation connecting pressure, volume and
temperature (cf. Sec. 3, Chapter III). There are several ways of attacking
this problem. 1 The one presented here is somewhat different from 
that used in elementary books, being based on the so-called virial of 
Clausius. 

We shall confine our attention first to an ideal gas, in which the 
molecules are mass particles in the form of elastic spheres with radii 
very small compared with their average distance apart, so that indeed 
their dimensions can be neglected. They are assumed to be free 
particles exerting no mutual forces and all having the same mass m. 
We shall suppose their number is $N$ and number their coordinates
$x_i, y_i, z_i$ in some inertial system with the subscript $i$ which runs
from 1 to $N$. Consider the following function of the coordinates

$$M = \sum_{i=1}^{N} \frac{m}{2}\left(x_i^2 + y_i^2 + z_i^2\right). \tag{1}$$



1 Cf. Sir James Jeans, "Dynamical Theory of Gases," Chapter VI. Cambridge, 
1930. The reader may also consult with profit other recent books on kinetic 
theory, e.g., L. B. Loeb, "Kinetic Theory of Gases," Chapters II and V, McGraw-Hill, 
New York, 1927; and E. H. Kennard, " Kinetic Theory of Gases," Chapters I and V, 
McGraw-Hill, New York, 1938.

This does not appear to have immediate physical significance. But
let us differentiate with respect to the time. Using the dot notation
for time differentiation, we obtain

$$\dot{M} = \sum_{i=1}^{N} m\left(x_i\dot{x}_i + y_i\dot{y}_i + z_i\dot{z}_i\right). \tag{2}$$

A second time differentiation gives

$$\ddot{M} = \sum_{i=1}^{N} m\left(\dot{x}_i^2 + \dot{y}_i^2 + \dot{z}_i^2\right) + \sum_{i=1}^{N} m\left(x_i\ddot{x}_i + y_i\ddot{y}_i + z_i\ddot{z}_i\right). \tag{3}$$



We shall now proceed to form an average of a somewhat different 
nature from any hitherto considered in this book, namely an average 
over the time. The coordinates, component velocities and component 
accelerations of the particles are of course functions of the time. 
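Hence the quantity $\ddot{M}$ will depend on the time. Its time average is

$$\overline{\ddot{M}} = \frac{1}{\tau}\int_0^{\tau} \ddot{M}\,dt = \frac{1}{\tau}\left(\dot{M}_\tau - \dot{M}_0\right), \tag{4}$$

where $\dot{M}_0$ is the value of $\dot{M}$ at the initial time $t = 0$ and
$\dot{M}_\tau$ its value at time $\tau$. Since the system of particles is
confined to a finite space, the $x_i, y_i, z_i$ coordinates are bounded at
all times. The component velocities $\dot{x}_i, \dot{y}_i, \dot{z}_i$ will
also be bounded. Consequently the difference $\dot{M}_\tau - \dot{M}_0$ in
(4) is bounded and therefore as $\tau$ increases $\overline{\ddot{M}} \to 0$.
From this we conclude that

$$\overline{\sum_{i=1}^{N} \frac{m}{2}\left(\dot{x}_i^2 + \dot{y}_i^2 + \dot{z}_i^2\right)} = -\frac{1}{2}\,\overline{\sum_{i=1}^{N} m\left(x_i\ddot{x}_i + y_i\ddot{y}_i + z_i\ddot{z}_i\right)}, \tag{5}$$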



where the averages are taken over a sufficiently long time. The 
quantity on the left-hand side is the time average of the total kinetic 
energy of the system. The quantity on the right-hand side may be 
written in slightly different form if we replace $m\ddot{x}_i$ by $F_{x_i}$,
the x component of the resultant force acting on the $i$th particle. Similar
replacement of the y and z component accelerations yields for the
right-hand side

$$\Omega = -\frac{1}{2}\,\overline{\sum_{i=1}^{N}\left(x_i F_{x_i} + y_i F_{y_i} + z_i F_{z_i}\right)}. \tag{6}$$

The quantity $\Omega$ was called by Clausius the virial of the system. The
relation (5) then states that for a system in canonical distribution the
total average kinetic energy is equal to its virial. This is the virial
theorem.

Let us now calculate the virial for an ideal gas by considering that
the force components which enter eq. (6) are merely those involved
in the collisions between the molecules and the walls. This means
that $F_{x_i}$, etc., have non-vanishing values only for the values of
$x_i, y_i, z_i$ at the walls. These forces arise because of the change in
momentum experienced by the molecules in their reflection from the
walls. It might seem difficult to compute them; but we must recall that
it is the time average which is involved in eq. (6). The average wall
forces can be effectively replaced by the integrated pressure which
according to the postulates of the kinetic theory is indeed the average
effect of the continual bombardment of the walls by the molecules.
Figure 5-1 represents a portion of the surface of the containing vessel.

FIG. 5-1.

Let us consider the area element $\mathbf{n}\,dS$ in this surface, where
$\mathbf{n}$ is the unit vector normal to the surface at $dS$. This element
has the position vector

$$\mathbf{r} = \mathbf{i}x + \mathbf{j}y + \mathbf{k}z \tag{7}$$



in the chosen system of rectangular coordinates. The average force
exerted on the system by the element of area is a vector directed
oppositely to $\mathbf{n}$ and of magnitude $p\,dS$, if we denote the
pressure of the gas by $p$. The components of this force along the axes
are then

$$-p\,dS\cos\alpha, \quad -p\,dS\cos\beta, \quad -p\,dS\cos\gamma,$$

where $\cos\alpha$, $\cos\beta$, $\cos\gamma$ are the direction cosines of
$\mathbf{n}$. The summation in (6) over all the molecules is now
conveniently replaced by an integral over the whole surface of the
containing vessel. Consequently we have for the virial

$$\Omega = \frac{p}{2}\iint \left(x\cos\alpha + y\cos\beta + z\cos\gamma\right) dS. \tag{8}$$



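Since by definition

$$\mathbf{n} = \mathbf{i}\cos\alpha + \mathbf{j}\cos\beta + \mathbf{k}\cos\gamma,$$

we can transform (8) to

$$\Omega = \frac{p}{2}\iint \mathbf{r}\cdot\mathbf{n}\,dS. \tag{9}$$

We can now employ the divergence theorem (often called Gauss' theorem)
in vector analysis 2 to write

$$\Omega = \frac{p}{2}\iiint \nabla\cdot\mathbf{r}\,dV, \tag{10}$$

where the integral is now the volume integral of the divergence of
$\mathbf{r}$ over the whole volume of the containing vessel. But

$$\nabla\cdot\mathbf{r} = \frac{\partial x}{\partial x} + \frac{\partial y}{\partial y} + \frac{\partial z}{\partial z} = 3.$$

Hence finally

$$\Omega = \tfrac{3}{2}pV. \tag{11}$$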

From the virial theorem we therefore draw the conclusion that the
total average kinetic energy of the molecules of the ideal gas is equal
to $3pV/2$, or

$$pV = \tfrac{2}{3}\,\overline{\overline{E}}. \tag{12}$$
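
The step from the surface integral to three times the volume does not depend on the shape of the vessel or on where the origin is placed. As an illustration only, the sketch below evaluates the surface integral of $\mathbf{r}\cdot\mathbf{n}$ by a midpoint sum over a spherical vessel whose center is displaced from the origin. The choice of a sphere, the grid sizes and the names are assumptions made here for the example.

```python
import math

def surface_integral_r_dot_n(radius, center, n_theta=200, n_phi=400):
    """Midpoint-rule estimate of the integral of r . n dS over a sphere.

    The sphere has the given radius and center; n is the outward unit normal.
    """
    cx, cy, cz = center
    d_theta = math.pi / n_theta
    d_phi = 2.0 * math.pi / n_phi
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        sin_t, cos_t = math.sin(theta), math.cos(theta)
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            nx, ny, nz = sin_t * math.cos(phi), sin_t * math.sin(phi), cos_t
            x, y, z = cx + radius * nx, cy + radius * ny, cz + radius * nz
            d_area = radius**2 * sin_t * d_theta * d_phi
            total += (x * nx + y * ny + z * nz) * d_area
    return total

RADIUS = 10.0                # cm, arbitrary example vessel
CENTER = (3.0, -2.0, 5.0)    # cm, deliberately off the origin
P_GAS = 1.013e6              # dynes/cm^2, one atmosphere (example value)

volume = 4.0 / 3.0 * math.pi * RADIUS**3
integral = surface_integral_r_dot_n(RADIUS, CENTER)
virial = 0.5 * P_GAS * integral  # the virial as (p/2) times the surface integral

print(f"surface integral / V = {integral / volume:.5f}")
print(f"virial / (pV)        = {virial / (P_GAS * volume):.5f}")
```

The two printed ratios come out close to 3 and 3/2 respectively, whatever displacement is chosen for the center.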






Having reached this point we cannot go further toward the actual
equation of state without introducing a macroscopic interpretation
of $\overline{\overline{E}}$. There are two possibilities. In the first
place, we may arbitrarily but plausibly assume that the total average
energy characterizes the temperature of the gas, so that constant
temperature implies constant $\overline{\overline{E}}$. With this
association (12) becomes at once Boyle's law for an ideal gas. If we wish
to be more specific we can introduce the statistical considerations of the
preceding chapter and assume that the molecules are canonically
distributed with respect to their kinetic energy. To be sure this is
passing beyond the methods of kinetic theory as commonly understood, but
it forms a useful bridge between the kinetic and statistical points of
view. We shall then suppose that the total average kinetic energy
$\overline{\overline{E}}$ is the same as $N\bar{E}$, where $N$ is the number
of molecules and $\bar{E}$ the average energy per molecule in a canonical
distribution. But in Sec. 5 of Chapter IV we have shown that

$$N\bar{E} = \tfrac{3}{2}N\Theta.$$



2 See Leigh Page, "Introduction to Theoretical Physics," second edition, p. 32.
D. Van Nostrand Co., New York, 1935.

Hence (12) becomes

$$pV = N\Theta. \tag{13}$$

But the equation of state of an ideal gas is known to be

$$pV = RT, \tag{14}$$

where T is the temperature on the Kelvin scale and R is the gas con- 
stant appropriate to the gas in question and its mass. If (13) is to 
be the kinetic-statistical analogue of (14) it is clear that we must have 

$$\Theta = \frac{R}{N}\,T. \tag{15}$$

In words, the modulus of the canonical distribution is directly pro- 
portional to the Kelvin temperature of the gas and the coefficient of 
proportionality is the gas constant per molecule. This is the quantity 
which has received the name of Boltzmann's gas constant k, i.e., 

$$k = \frac{R}{N}, \tag{16}$$

and 

$$\Theta = kT. \tag{17}$$

With this assignment the kinetic statistical derivation of the equation 
of state of an ideal gas may be considered complete. 

2. SOME SIMPLE KINETIC THEORY PROPERTIES OF AN IDEAL GAS 

The simple kinetic theory described in the preceding section leads 
to some interesting properties of an ideal gas. We shall review these 
briefly here. 

If we write the kinetic energy in the form

$$\sum_{i=1}^{N} \frac{m}{2}\left(\dot{x}_i^2 + \dot{y}_i^2 + \dot{z}_i^2\right) = \sum_{i=1}^{N} \frac{m}{2}v_i^2,$$

it follows that we can express the average in terms of a root-mean-square
velocity

$$v_m = \sqrt{\overline{v^2}} \tag{18}$$

as follows

$$\overline{\overline{E}} = \frac{Nm}{2}v_m^2. \tag{19}$$

Equation (12) then becomes

$$pV = \tfrac{1}{3}Nmv_m^2. \tag{20}$$

The value of $v_m$ can be estimated very easily from the fact that the
density of the gas is given by

$$\rho = \frac{Nm}{V}, \tag{21}$$

whence

$$v_m = \sqrt{\frac{3p}{\rho}}. \tag{22}$$

The substitution of the atmospheric pressure in dynes/square centimeter
and the density of hydrogen under standard conditions ($T = 273$ K,
pressure 76 cm of Hg) yields $v_m = 1{,}900$ meters/sec approximately for
hydrogen, in so far as hydrogen can be considered an ideal
gas. Actually $v_m$ for an ideal gas does not depend on the pressure.
From (14) we readily write

$$v_m = \sqrt{\frac{3kT}{m}}, \tag{23}$$

indicating that for a given gas $v_m$ depends solely on the absolute
temperature. Precise knowledge of the mass of a single molecule, e.g.,
that of hydrogen, then suffices for the determination of $k$. Using the
value $m_{H_2} = 3.32 \times 10^{-24}$ gram leads to

$$k = 1.37 \times 10^{-16} \text{ ergs/degree C}. \tag{24}$$
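
The estimate of $v_m$ from (22) and the determination of $k$ from (23) can be reproduced as follows. The atmospheric pressure and the density of hydrogen used here are assumed handbook values, since the text does not quote them; the function names are illustrative.

```python
import math

P_ATM = 1.013e6    # dynes/cm^2 for 76 cm of Hg (ASSUMED handbook value)
RHO_H2 = 8.99e-5   # gram/cm^3, hydrogen at 0 C and 76 cm of Hg (ASSUMED handbook value)
T_STD = 273.0      # K
M_H2 = 3.32e-24    # gram, mass of the hydrogen molecule as quoted in the text

def rms_speed_from_density(pressure, density):
    """v_m = sqrt(3 p / rho), eq. (22)."""
    return math.sqrt(3.0 * pressure / density)

def boltzmann_constant(mass, v_rms, temp):
    """Solve v_m = sqrt(3 k T / m), eq. (23), for k."""
    return mass * v_rms**2 / (3.0 * temp)

v_m = rms_speed_from_density(P_ATM, RHO_H2)
k_est = boltzmann_constant(M_H2, v_m, T_STD)

print(f"v_m = {v_m / 100.0:.0f} meters/sec")
print(f"k   = {k_est:.3e} ergs/degree")
```

With these inputs the speed comes out somewhat below 1,900 meters/sec, of the same order as the rounded figure in the text, and $k$ agrees with (24) to the three figures given.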

Coming back to the ideal gas law in the form 

pV = NkT, (25) 

we observe that since k is a universal constant, N is the same for 
equal volumes of all ideal gases at the same temperature and pressure. 
This is the law of Avogadro. In particular since the chemical molec- 
ular weights are proportional to the actual molecular masses, it 
follows that the number of molecules in a gram molecule or mole of 
any ideal gas is a universal constant. This is the Avogadro number, 
now generally given as 

$$N_A = 6.03 \times 10^{23}, \tag{26}$$

whose best evaluation is from electrochemical data, namely the value
of the Faraday $Q$ or charge necessary to evolve one chemical equivalent
of any element in an electrolytic process coupled with the fundamental
electric charge $e$. We have indeed

$$Q = N_A e,$$

and with $Q = 9{,}650$ emu and $e = 1.6 \times 10^{-20}$ emu, $N_A$ is found
to be $6.03 \times 10^{23}$. The Avogadro number is not, of course, confined
to an ideal gas. According to atomic theory it is a genuine universal
constant giving the number of molecules in a mole of any element or
compound.

No difficulty should be experienced in deducing from the above 
simple considerations other important results of elementary kinetic 
theory, including Graham's effusion law, according to which the 
effusion velocity through small orifices of a gas at constant tempera- 
ture and pressure varies inversely as the square root of the density, 
and Dalton's law of partial pressures that the pressure of a mixture
of two gases is equal to the sum of the pressures which each would 
exert individually if alone in the same volume. 

The kinetic theory also has something of interest to say about
the specific heats of a gas (Sec. 4, Chapter III). The total average
energy per gram of an ideal gas whose molecules are single mass
particles is from (19)

$$\overline{E} = \frac{3}{2}\frac{kT}{m}. \tag{27}$$

Consequently the specific heat at constant volume is

$$c_V = \frac{3k}{2m}. \tag{28}$$

Multiplying both numerator and denominator of (28) by $N_A$ gives

$$c_V = \frac{3}{2}\frac{kN_A}{M} = \frac{3}{2}R_m, \tag{29}$$

where $M$ is the molecular weight and $R_m$ the gas constant per gram of
gas. For the rare gas argon, for example, $R_m = 2.1 \times 10^6$ ergs/gram
degree and hence (29) yields (in terms of the more familiar calories/gram
degree C)

$$c_V = 0.075 \text{ cal/gram degree C}.$$

This agrees very closely with the experimentally observed value at 
room temperature. 
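
The figure for argon follows from (29) once the gas constant per gram is known. In the sketch below the molar gas constant, the atomic weight of argon and the number of ergs per calorie are assumed handbook values not given in the text; the names are illustrative.

```python
R_MOLAR = 8.314e7     # ergs/(mole degree), ASSUMED handbook value
M_ARGON = 39.94       # gram/mole, ASSUMED atomic weight of argon
ERG_PER_CAL = 4.186e7 # ergs per calorie, ASSUMED conversion factor

def gas_constant_per_gram(molecular_weight):
    """R_m = R / M, the gas constant per gram of gas."""
    return R_MOLAR / molecular_weight

def cv_monatomic_cal(molecular_weight):
    """c_V = (3/2) R_m from eq. (29), converted to cal/gram degree."""
    return 1.5 * gas_constant_per_gram(molecular_weight) / ERG_PER_CAL

r_m_argon = gas_constant_per_gram(M_ARGON)
cv_argon = cv_monatomic_cal(M_ARGON)

print(f"R_m = {r_m_argon:.2e} ergs/gram degree")
print(f"c_V = {cv_argon:.3f} cal/gram degree")
```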

For a gas whose molecules consist of more than one mass particle,
we can no longer assume that the average energy per gram is given by
(27), since the internal kinetic energy of the constituent parts of the
molecule relative to its center of mass must be considered. We may
account for this by writing in place of (27)

$$\overline{E} = \frac{3}{2}\frac{kT}{m}(1 + \beta), \tag{30}$$

where $\beta$ represents the ratio of the average internal kinetic energy to
the average kinetic energy of translation of the center of mass of the
molecule. It will have the same value for all gases whose molecules
have the same constitution. We then get

$$c_V = \frac{3}{2}(1 + \beta)\frac{k}{m}. \tag{31}$$

For example, if a molecule consists of two particles of equal mass
rotating about some axis perpendicular to the line joining them, it
develops that $\beta = 2/3$ so that

$$c_V = \frac{5}{2}\frac{k}{m}. \tag{32}$$

This choice of $\beta$ is dictated by the equipartition principle. There are
three degrees of freedom of translation of the center of mass of the
system of two particles whereas there are two degrees of freedom of
rotation, namely about two mutually perpendicular independent axes
also perpendicular to the line joining the two particles. Now if the
equipartition principle applies also to rotational energy we should
expect that at given temperature the ratio of the average kinetic
energy of rotation to that of translation would be precisely 2/3. This
is the basis of the choice in (32). 3 This formula holds well for
hydrogen. The substitution of $m_{H_2}$ and $k$ into (32) gives indeed

$$c_V = 2.46 \text{ cal/gram degree C}$$

in fairly close agreement with the measured value for hydrogen at
room temperature.

3 See, for example, Kennard, op. cit., p. 365, for a proof that the equipartition
principle holds equally well for rotational as for translational energy.

The ratio of the specific heat at constant pressure to that at constant
volume, $\gamma = c_p/c_V$, can also be handled by the application of
the preceding considerations to eq. (25) of Chapter III. The latter
can be rewritten in terms of the actual specific heats (instead of the
heat capacities) as follows

$$c_p - c_V = R_m, \tag{33}$$

it being understood that comparable units are employed on both sides.
For monatomic gases we therefore have from eq. (29)

$$\frac{c_p}{c_V} = \gamma = \frac{5}{3}. \tag{34}$$

This ratio is found to hold pretty exactly for the rare gases of the
atmosphere. For diatomic gases like hydrogen and oxygen, on the
other hand, we have from (32)

$$\gamma = \frac{7}{5}, \tag{35}$$

in rather good agreement with experimental results for the quasi-ideal
diatomic gases. As the complexity of constitution of the molecule
increases, $c_V$ increases and we expect $\gamma$ to approach unity. This
again is experimentally confirmed to a considerable extent. For the
statistical theory of specific heats, cf. Chapter IX.
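
The dependence of $c_V$ and $\gamma$ on the ratio $\beta$ can be tabulated in a few lines. The sketch uses the relation $c_p - c_V = R_m$ with $R_m = k/m$ to write $\gamma = 1 + 2/[3(1 + \beta)]$; the conversion factor from ergs to calories is an assumed handbook value and the function names are illustrative.

```python
K_BOLTZMANN = 1.37e-16  # ergs/degree, eq. (24) of this chapter
M_H2 = 3.32e-24         # gram, mass of the hydrogen molecule
ERG_PER_CAL = 4.186e7   # ergs per calorie, ASSUMED conversion factor

def cv_per_gram(beta, mass, k=K_BOLTZMANN):
    """c_V = (3/2)(1 + beta) k/m in ergs/gram degree, eq. (31)."""
    return 1.5 * (1.0 + beta) * k / mass

def gamma(beta):
    """gamma = c_p/c_V, using c_p - c_V = R_m = k/m from eq. (33)."""
    return 1.0 + 2.0 / (3.0 * (1.0 + beta))

BETA_MONATOMIC = 0.0        # no internal kinetic energy
BETA_DIATOMIC = 2.0 / 3.0   # two rotational degrees of freedom against three of translation

cv_hydrogen_cal = cv_per_gram(BETA_DIATOMIC, M_H2) / ERG_PER_CAL

print(f"gamma (monatomic) = {gamma(BETA_MONATOMIC):.4f}")
print(f"gamma (diatomic)  = {gamma(BETA_DIATOMIC):.4f}")
print(f"c_V (hydrogen)    = {cv_hydrogen_cal:.2f} cal/gram degree")
```

As $\beta$ grows the second term in $\gamma$ shrinks, which is the approach of $\gamma$ to unity for molecules of complex constitution.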

3. COLLISIONS AND MEAN FREE PATH OF A GAS 

In our discussion so far we have entirely neglected the possibility 
that the molecules of a gas may collide with each other. We have 
indeed assumed effectively that the molecules are geometrical points. 
If they are actually possessed of finite extension, however, they will 
be bound to hit each other in their flight and such collisions conceiv- 
ably should have an important bearing on the properties of a gas. 

For the elementary considerations of the present section we shall
assume that the molecules are perfectly elastic spheres of diameter $D$.
Suppose that all the molecules save one are instantaneously at rest,
whereas that one moves with respect to the others with average relative
velocity $\bar{v}_r$. Denote the average number of collisions per second
experienced by any molecule by $Z_c$. The distance between successive
collisions is called the free path of the molecule and the average value
of a large number of free paths is termed the mean free path. We
shall denote it by $\lambda$. Clearly, if the average velocity of the
molecules is given by $v_m$, we have the fundamental relation

$$\lambda = \frac{v_m}{Z_c}. \tag{36}$$

Naturally the precise value of $\lambda$ for given $Z_c$ depends on the value
chosen for the average velocity. The root-mean-square velocity is the
one usually adopted.

The average number of collisions per second for any spherical molecule
of diameter $D$ is the same as the average number of collisions per
second for a single spherical molecule of radius $D$ moving through a
field of point molecules. Consequently we can get an approximate
value for $Z_c$ by multiplying the average number of molecules per unit
volume by the volume of the right circular cylinder traced out by a
circle of radius $D$ moving with the average relative velocity $\bar{v}_r$.
Thus

$$Z_c = \pi D^2 n \bar{v}_r, \tag{37}$$

where $n$ is here the number of molecules per unit volume. In order to
utilize (37) it is necessary to evaluate $\bar{v}_r$ in terms of $v_m$. This
can be readily done in the simple special case for which all the molecules
have the same velocity magnitude, viz., $v_m$. Consider the two molecules
whose paths make the angle $\theta$ with each other (cf. Fig. 5-2). The
relative velocity is clearly

$$v_r = 2v_m \sin\frac{\theta}{2}. \tag{38}$$

FIG. 5-2.

Hence

$$\bar{v}_r = 2v_m\,\overline{\sin\frac{\theta}{2}}, \tag{39}$$

and we must calculate the average of $\sin(\theta/2)$ over the whole
collection of molecules. Now if we fix our attention on a single molecule
the probability of finding another molecule with its velocity included in
the angular region $d\theta$ between $\theta$ and $\theta + d\theta$ from the
direction of the first molecule is simply

$$\tfrac{1}{2}\sin\theta\,d\theta. \tag{40}$$

This may be seen from the fact that we are assuming the velocity
directions of the molecules to be symmetrically distributed so that no
particular direction has greater a priori probability than any other.
Hence the fractional number of molecules having their velocity
directions at any instant included within the range $\theta$ and
$\theta + d\theta$ with any arbitrary direction is the ratio of the solid
angle included between $\theta$ and $\theta + d\theta$ to the whole solid
angle about a point, viz., $4\pi$. This ratio is readily seen, however, to be

$$\frac{2\pi \sin\theta\,d\theta}{4\pi},$$

or the value given in (40). Hence, from the general definition of an
average quantity

$$\overline{\sin\frac{\theta}{2}} = \frac{1}{2}\int_0^{\pi} \sin\frac{\theta}{2}\,\sin\theta\,d\theta. \tag{41}$$

Note that the integration over $\theta$ is taken between the limits 0 and
$\pi$. We see indeed that

$$\frac{1}{2}\int_0^{\pi} \sin\theta\,d\theta = 1. \tag{42}$$

The evaluation of the integral in (41) leads at once to

$$\overline{\sin\frac{\theta}{2}} = \frac{2}{3}$$

and hence from (39)

$$\bar{v}_r = \tfrac{4}{3}v_m. \tag{43}$$

Consequently (37) becomes

$$Z_c = \tfrac{4}{3}\pi D^2 n v_m, \tag{44}$$

and the mean free path is

$$\lambda = \frac{3}{4\pi n D^2}. \tag{45}$$

This result was first obtained by Clausius. 4 The mean free path is
thus seen to depend on the number of molecules per unit volume and
the diameter of each. The former quantity $n$ can readily be obtained
from the Avogadro number and the fact that the volume of a mole of
any ideal gas is equal to $22.41 \times 10^3$ cm$^3$ under standard
conditions of temperature and pressure. Hence approximately at 0°C and
76 cm of Hg

$$n = 2.70 \times 10^{19}. \tag{46}$$
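
The average in (41), the number density and the mean free path of Clausius can be evaluated together. The sketch below is an illustration only: the midpoint-rule sum and the function names are choices made here, and the molecular diameter is a purely hypothetical trial value, since the text has not yet supplied a measure of $D$.

```python
import math

N_AVOGADRO = 6.03e23    # per mole, eq. (26)
MOLAR_VOLUME = 22.41e3  # cm^3 per mole at 0 C and 76 cm of Hg
D_TRIAL = 2.5e-8        # cm, HYPOTHETICAL diameter used only to exercise the formula

def mean_sin_half_angle(n_steps=10000):
    """Midpoint-rule value of (1/2) * integral of sin(t/2) sin(t) dt from 0 to pi."""
    d_theta = math.pi / n_steps
    total = 0.0
    for i in range(n_steps):
        theta = (i + 0.5) * d_theta
        total += math.sin(0.5 * theta) * 0.5 * math.sin(theta) * d_theta
    return total

def number_density():
    """Molecules per cm^3 under standard conditions."""
    return N_AVOGADRO / MOLAR_VOLUME

def mean_free_path(n, diameter):
    """lambda = 3 / (4 pi n D^2), the result attributed to Clausius."""
    return 3.0 / (4.0 * math.pi * n * diameter**2)

avg_sin = mean_sin_half_angle()
n_std = number_density()
lam = mean_free_path(n_std, D_TRIAL)

print(f"mean of sin(theta/2) = {avg_sin:.6f}")
print(f"n = {n_std:.3e} per cm^3")
print(f"lambda for the trial diameter = {lam:.2e} cm")
```

The number density obtained from the rounded constants is close to the value in (46); the mean free path printed is only as good as the trial diameter.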

How shall we get a measure of $D$? Many methods are available, 5 but
we shall consider only one. This has to do with the viscosity of a gas,
in itself an interesting topic.

4 Cf. Kennard, op. cit., p. 105.

5 Cf. Loeb, op. cit., Appendix I, pp. 523 ff.

4. ELEMENTARY KINETIC THEORY OF VISCOSITY

The viscosity of a gas is one of the most significant effects of the 
molecular motion within the gas on the large scale motion of the gas 
as a whole. It will be recalled that in all actual fluids the motion of 
any part of the fluid is to a certain extent resisted by the rest. This 
effect is attributed to a so-called viscous force which each layer of 
fluid exerts on the immediately adjacent layer. Newton made the 
assumption that the viscous force is directly proportional to the flow 
velocity gradient or rate of change of velocity with distance normal 
to the direction of flow. Moreover, it is also assumed that the viscous 
force is proportional to the area of the contiguous layers. The con- 
stant of proportionality is called the coefficient of viscosity, or more 
briefly, the viscosity of the fluid. If we denote the area in question
by $A$, the flow velocity gradient by $dV/ds$, and the viscosity by
$\eta$, the viscous force then may be written

$$F = \eta A \frac{dV}{ds}. \tag{47}$$

The viscosity is thus the viscous force per unit area per unit velocity 
gradient. In absolute units its dimensions are dyne second/square 
centimeter. In these units the value for water at 20° C is 0.01. As
might be expected, the viscosity values for gases are very much
smaller. That for hydrogen at 0° C is $8.4 \times 10^{-5}$ in absolute units.
The viscosity of liquids decreases as the temperature increases, but 
that of gases increases with the temperature. 

The explanation of the viscosity of a liquid is probably to be 
found in considerable measure in the cohesive forces between the 
constituent parts. This explanation is not available for ideal gases 
in which such forces are ignored. Yet even a gas like hydrogen, 
which at ordinary temperatures approaches close to the ideal variety 
in its other properties, possesses definitely measurable viscosity. 
Maxwell was the first to give a kinetic theory description of the vis- 
cosity of a gas in terms of the motion of the molecules and in particular 
the transfer of momentum by the random motion of the molecules 
from one moving layer of gas to another. An elementary discussion 
based on this idea follows. 

Consider a gas which is flowing as a whole from left to right.
Draw the parallel horizontal planes A, P, and B (Fig. 5·3) which
contain the direction of flow, and suppose the flow velocity of the
gas in plane A is $V_1$, while that in plane B is $V_2$, where A and B
are chosen a distance apart equal to $2\lambda$. The plane P is
equidistant from A and B.

FIG. 5·3. (Parallel planes A, P, and B, with A and B a distance
$2\lambda$ apart.)

If we make the assumption that all the molecules have the speed
$v_m$, the average number of molecules traveling downward across unit
area of P per second is $nv_m/6$, while on the average the same number
of molecules travel upward across unit area of P per second. Because
of the way the planes have been drawn, the molecules mentioned have
suffered their last collisions (before striking P) in the planes A
or B. If we assume that in passing through A or B each one
instantaneously acquires the appropriate flow velocity $V_1$ or $V_2$,
it follows that the $nv_m/6$ downward-moving molecules convey from A
to the gas below P the flow momentum $(nv_m/6)\,mV_1$ per second per
unit area, while the $nv_m/6$ upward-moving molecules convey from B to
the gas above P the flow momentum $(nv_m/6)\,mV_2$ per second per unit
area. Now from Newton's second law the downward transfer of momentum
per second per unit area, viz., $(nv_m/6)\,mV_1$, represents the
tangential stress exerted by the gas above the plane P on the gas
below, while the upward transfer of momentum per second per unit
area, viz., $(nv_m/6)\,mV_2$, represents the tangential stress exerted
by the gas below the plane P on the gas above. The equal and opposite
reaction (Newton's third law) on the gas below is therefore
$(nv_m/6)\,mV_2$. Hence the resultant tangential drag on unit area of
the gas immediately below P is given by

$$\frac{n m v_m}{6}\,(V_1 - V_2).$$

By definition this corresponds to $F/A$ in eq. (47). Moreover the
velocity gradient in the neighborhood of P is clearly to a good
approximation

$$\frac{dV}{ds} = \frac{V_1 - V_2}{2\lambda}.$$

Hence the viscosity $\eta$ becomes at once

$$\eta = \frac{n m v_m \lambda}{3} = \frac{\rho v_m \lambda}{3}, \tag{48}$$

where $\rho$ is the density of the gas. This fundamental relation has
been derived here by a rather crude method making use of somewhat
questionable assumptions. More careful and elaborate deductions are to
be found in the standard kinetic theory textbooks. 6 By substituting
for $v_m$ from (23) and $\lambda$ from (45), the expression for $\eta$
becomes

$$\eta = \frac{\sqrt{3mkT}}{4\pi D^2}. \tag{49}$$


This brings out the interesting fact that the viscosity of a gas should 
be independent of the pressure and therefore of the density. This 
prediction was first made by Maxwell who also verified it experi- 
mentally over a wide range of values. The variation with the square 
root of the absolute temperature has also received ample experimental 
verification. For a description of experimental methods of measuring
$\eta$, Loeb's book may be consulted. Equations (48) and (49) may now
be used to obtain numerical estimates of $\lambda$, $D$, and $Z_c$.
Thus for hydrogen under standard conditions we obtain

$$\lambda = 1.7 \times 10^{-5}\ \text{cm}, \qquad D \sim 10^{-8}\ \text{cm}, \qquad Z_c \sim 10^{10}\ \text{sec}^{-1}.$$
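
The estimates just quoted can be reproduced in order of magnitude with a few lines of arithmetic. The sketch below inverts $\eta = \rho v_m \lambda/3$ for $\lambda$, then uses the Clausius relation $\lambda = 3/(4\pi D^2 n)$ for $D$ and $Z_c = v_m/\lambda$ for the collision rate. The viscosity and number density are those given in the text; the Boltzmann constant, Avogadro number, and molecular weight of hydrogen are modern values supplied here as assumptions, so the result should be expected to agree with the text only roughly.

```python
import math

K_BOLTZMANN = 1.380649e-16    # erg/K (modern value; assumption)
AVOGADRO = 6.02214076e23      # per mole (modern value; assumption)
H2_MOLAR_MASS = 2.016         # g/mole (assumption)

ETA_H2 = 8.4e-5               # dyne sec/cm^2 at 0 deg C (from the text)
N_DENSITY = 2.70e19           # molecules per cm^3, eq. (46)
T_KELVIN = 273.15             # 0 deg C

m = H2_MOLAR_MASS / AVOGADRO                     # mass of one molecule, g
v_m = math.sqrt(3.0 * K_BOLTZMANN * T_KELVIN / m)  # root-mean-square speed, cm/sec
rho = N_DENSITY * m                              # density, g/cm^3

# Invert eta = rho * v_m * lambda / 3 for the mean free path.
mean_free_path = 3.0 * ETA_H2 / (rho * v_m)

# Invert lambda = 3 / (4 pi D^2 n) for the molecular diameter.
diameter = math.sqrt(3.0 / (4.0 * math.pi * N_DENSITY * mean_free_path))

# Collisions per second suffered by one molecule.
collision_rate = v_m / mean_free_path

print(f"v_m    = {v_m:.3e} cm/sec")
print(f"lambda = {mean_free_path:.2e} cm")
print(f"D      = {diameter:.2e} cm")
print(f"Z_c    = {collision_rate:.2e} per sec")
```

With these inputs the mean free path comes out somewhat below the figure quoted in the text, which is to be expected from the difference in constants; the orders of magnitude of all three quantities are unchanged.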

5. THE MAXWELLIAN DISTRIBUTION OF VELOCITIES 

Although at the very beginning of this review of kinetic theory 
we included among the fundamental ideas the assumption that the 
molecules may differ in velocity, in the applications so far considered, 
particularly those in Sec. 3, we have actually assumed that effectively 
all the molecules move with the root-mean-square velocity $v_m$. It
now remains to be seen how the variation in velocity may be taken

6 Cf. Kennard, op. cit., pp. 138 ff.; Loeb, op. cit., pp. 180 ff. Also cf. Page,
op. cit., p. 343.

into account. We must first consider how the molecules are
distributed on the average with respect to their velocities. This is the
main problem of kinetic theory and many deductions have been given, 
the most famous being those of Maxwell and Boltzmann. From the 
purely dynamical point of view the problem is to consider an aggre- 
gate of confined elastic spheres which are continually colliding with 
each other, with the velocity of each sphere usually changing at each 
collision. However at any given instant there will be a certain number 
of spheres having velocities lying within any initially prescribed 
velocity interval. It is a fundamental assumption of kinetic theory 
that this number ultimately reaches a value which does not vary 
with the time as long as the gas represented by the aggregate remains 
in a state of equilibrium. The problem is to find this number. Boltz- 
mann attacked it from the standpoint of the average effect of the 
elastic impacts on the velocities of the individual particles. Maxwell 
disregarded impacts altogether and considered the distribution of 
velocities from the standpoint of pure probabilities. There has been 
much argument about these deductions. We shall not go into either 
but merely show how the distribution they arrived at more or less 
immediately emerges from the application of the classical statistics 
of Chapter IV to an aggregate of particles, with the assumption that 
the state of equilibrium is represented completely by a canonical 
distribution with respect to the energy of the particles. The
distribution of velocities is then obtained at once from the canonical
distribution, eq. (27) or eq. (27') of Chapter IV, with $E_j$ as the
kinetic energy of a particle of mass $m$ and velocity components
$v_x$, $v_y$, $v_z$. In accordance with eq. (17) of this chapter we set
$\Theta = kT$. For the partition function $\sum_j g_j e^{-E_j/\Theta}$
we use the expression for $Z$ given in eq. (83) of Chapter IV. The a
priori probability $g_j$ is that given in (79) of Chapter IV. With these
substitutions, the number of molecules having their velocity
components in the interval $v_x$, $v_x + \Delta v_x$; $v_y$,
$v_y + \Delta v_y$; $v_z$, $v_z + \Delta v_z$ may be written

$$\Delta N = N \left(\frac{m}{2\pi kT}\right)^{3/2} e^{-\frac{m}{2kT}(v_x^2 + v_y^2 + v_z^2)}\, \Delta v_x\, \Delta v_y\, \Delta v_z. \tag{50}$$

This is the Maxwellian velocity distribution formula. If we denote
$(m/2\pi kT)^{3/2} e^{-mv^2/2kT}$ (with $v^2 = v_x^2 + v_y^2 + v_z^2$)
by $f(v)$, we have

$$\frac{\Delta N}{N} = f(v)\, \Delta v_x\, \Delta v_y\, \Delta v_z. \tag{51}$$

The function $f(v)$ is the well-known probability or Gauss error
function already introduced in Chapter II (cf. Fig. 2·1). The fractional
number of molecules per unit velocity interval in the neighborhood 
of v is a symmetrical function of the resultant velocity, with the 
result that the average velocity is zero. 

Often a more useful type of distribution law than (51) is that for
the fractional number of molecules having resultant velocity magnitude
in the interval from $v$ to $v + \Delta v$. This distribution thus
disregards the direction of the velocity. If we imagine a momentum
space in which each point represents a possible resultant momentum
value for a molecule, the points corresponding to molecules with
resultant momentum magnitudes between $mv$ and $mv + m\Delta v$ will
lie in a spherical shell of radius $mv$ and thickness $m\Delta v$, with
volume $4\pi m^3 v^2 \Delta v$. Denoting by $(\Delta N)'$ the number of
molecules corresponding to this volume gives

$$\frac{(\Delta N)'}{\Delta N} = \frac{4\pi m^3 v^2\, \Delta v}{m^3\, \Delta v_x\, \Delta v_y\, \Delta v_z}. \tag{52}$$

FIG. 5·4.


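By substitution for $\Delta N$ from (50) the desired distribution
formula then proves to be

$$\frac{(\Delta N)'}{N} = 4\pi \left(\frac{m}{2\pi kT}\right)^{3/2} v^2 e^{-mv^2/2kT}\, \Delta v = \phi(v)\, \Delta v. \tag{53}$$
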

The plot of $\phi(v)$ yields the non-symmetrical curve shown in Fig. 5·4.
We can use the non-symmetrical Maxwell distribution to calculate
some average values of the velocity. First we shall compute the
velocity corresponding to $\phi_{\max}$. This will be denoted by $v^*$
and is obtained by solving $d\phi/dv = 0$. We get

$$v^* = \sqrt{\frac{2kT}{m}}. \tag{54}$$


Next, the average of the absolute value of $v$ is given by

$$\bar{v} = \int_0^{\infty} v\, \phi(v)\, dv.$$

On employing partial integration there results

$$\bar{v} = \sqrt{\frac{8kT}{\pi m}}. \tag{55}$$

We have already introduced the root-mean-square velocity as the most
important average velocity in kinetic theory. The evaluation of
$\overline{v^2}$ from (53) yields

$$\overline{v^2} = \int_0^{\infty} v^2 \phi(v)\, dv = \frac{3kT}{m}, \tag{56}$$

with

$$v_m = \sqrt{\overline{v^2}} = \sqrt{\frac{3kT}{m}}, \tag{57}$$

in agreement with the result of the simple theory at the beginning of
this chapter. We have indicated in Fig. 5·4 the fact that

$$v^* < \bar{v} < v_m.$$

In practically all subsequent applications we shall employ $v_m$.

The average kinetic energy can also be found from (56) with the
usual result, i.e.,

$$\bar{E} = \tfrac{1}{2} m \overline{v^2} = \tfrac{3}{2} kT. \tag{58}$$

Many attempts have been made to verify the formula (53)
experimentally. One of the more recent ones is that of Zartman, 7 who
studied the deposition of evaporated bismuth atoms on the inside of
a revolving cylinder and obtained a "velocity spectrum" in good
agreement with the Maxwell law. 8
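
The three characteristic speeds of the Maxwell distribution can be checked numerically. The sketch below works in the dimensionless variable $u = v/\sqrt{kT/m}$ (a choice made here for convenience, not taken from the text), in which the speed distribution reads $\phi(u) = \sqrt{2/\pi}\, u^2 e^{-u^2/2}$ and the expected values are $v^* = \sqrt{2}$, $\bar{v} = \sqrt{8/\pi}$, and $v_m = \sqrt{3}$. The Simpson-rule helper and the cutoff are illustrative assumptions.

```python
import math


def phi(u):
    """Maxwell speed distribution with speeds in units of sqrt(kT/m)."""
    return math.sqrt(2.0 / math.pi) * u * u * math.exp(-0.5 * u * u)


def simpson(func, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * func(a + i * h)
    return total * h / 3.0


U_MAX = 12.0  # upper cutoff; the integrand is negligible beyond this point

norm = simpson(phi, 0.0, U_MAX)
mean_speed = simpson(lambda u: u * phi(u), 0.0, U_MAX)
rms_speed = math.sqrt(simpson(lambda u: u * u * phi(u), 0.0, U_MAX))

# Most probable speed: locate the maximum of phi on a fine grid.
grid = [i * 1.0e-4 for i in range(1, 50_000)]
most_probable = max(grid, key=phi)

print(f"normalization = {norm:.6f}")
print(f"v*  = {most_probable:.4f}")
print(f"v   = {mean_speed:.4f}")
print(f"v_m = {rms_speed:.4f}")
```

The ordering $v^* < \bar{v} < v_m$ indicated in the figure is visible directly in the printed values.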

6. APPLICATION OF THE MAXWELL DISTRIBUTION TO COLLISIONS 

The discussion of collisions and mean free path given in Sec. 3 
can now be generalized to include the assumption of the Maxwell 
distribution. Let us consider the collisions between the set of
molecules having velocity components in the interval $v_x$,
$v_x + dv_x$, etc., to be denoted as set 1 for convenience, and the set
having velocity components in the interval $V_x$, $V_x + dV_x$, etc., to
be denoted as set 2. The relative velocity of a molecule in set 1 with
respect to a molecule in set 2 has the magnitude

$$v_r = \sqrt{(V_x - v_x)^2 + (V_y - v_y)^2 + (V_z - v_z)^2}. \tag{59}$$

7 Phys. Rev. 37, 383 (1931).

8 For a discussion of these and other similar experiments, the reader is
referred to Kennard, loc. cit., pp. 71 f. See also "Atomic Physics,"
Univ. of Pittsburgh Staff, second edition, p. 11, 1937.

The average number of collisions per second which any single molecule
of set 2 makes is (from eq. 37)

$$Z_c' = \pi D^2 n \left(\frac{m}{2\pi kT}\right)^{3/2} \iiint_{-\infty}^{+\infty} v_r\, e^{-\frac{m}{2kT}(v_x^2 + v_y^2 + v_z^2)}\, dv_x\, dv_y\, dv_z, \tag{60}$$

where the average is taken over all the molecules in unit volume by
allowing $v_x$, $v_y$, $v_z$ to take on all possible values. Evidently
$Z_c'$ is a function of $V_x$, $V_y$, $V_z$. What we wish is the average
number of collisions suffered by a molecule of any velocity. To get
this from (60) we must multiply by the probability of finding a
molecule of set 2 and then integrate once more over the velocities.
Thus the desired collision rate is finally

$$Z_c = \pi D^2 n \left(\frac{m}{2\pi kT}\right)^{3} \int\!\cdots\!\int_{-\infty}^{+\infty} v_r\, e^{-\frac{m}{2kT}(v_x^2 + v_y^2 + v_z^2 + V_x^2 + V_y^2 + V_z^2)}\, dv_x\, dv_y\, dv_z\, dV_x\, dV_y\, dV_z. \tag{61}$$
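To evaluate the integral (61) we introduce the following transformation

$$\begin{aligned}
V_x &= \alpha - \frac{a}{2}, & v_x &= \alpha + \frac{a}{2}, \\
V_y &= \beta - \frac{b}{2}, & v_y &= \beta + \frac{b}{2}, \\
V_z &= \gamma - \frac{c}{2}, & v_z &= \gamma + \frac{c}{2}.
\end{aligned} \tag{62}$$

Then

$$v_r = \sqrt{a^2 + b^2 + c^2},$$

and substitution into (61) yields

$$Z_c = \pi D^2 n \left(\frac{m}{2\pi kT}\right)^{3} \int\!\cdots\!\int_{-\infty}^{+\infty} \sqrt{a^2 + b^2 + c^2}\; e^{-\frac{m}{kT}(\alpha^2 + \beta^2 + \gamma^2)}\, e^{-\frac{m}{4kT}(a^2 + b^2 + c^2)}\, d\alpha\, d\beta\, d\gamma\, da\, db\, dc, \tag{63}$$
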

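where we have used the fact that

$$dV_x\, dV_y\, dV_z\, dv_x\, dv_y\, dv_z = \frac{\partial(V_x, V_y, V_z, v_x, v_y, v_z)}{\partial(\alpha, \beta, \gamma, a, b, c)}\, d\alpha\, d\beta\, d\gamma\, da\, db\, dc, \tag{64}$$

the functional determinant being

$$\frac{\partial(V_x, V_y, V_z, v_x, v_y, v_z)}{\partial(\alpha, \beta, \gamma, a, b, c)} =
\begin{vmatrix}
\dfrac{\partial V_x}{\partial \alpha} & \dfrac{\partial V_y}{\partial \alpha} & \cdots & \dfrac{\partial v_z}{\partial \alpha} \\
\vdots & \vdots & & \vdots \\
\dfrac{\partial V_x}{\partial c} & \dfrac{\partial V_y}{\partial c} & \cdots & \dfrac{\partial v_z}{\partial c}
\end{vmatrix}. \tag{65}$$

In the present case this reduces to unity, leading to (63). To evaluate
(63) we note first that

$$\iiint_{-\infty}^{+\infty} e^{-\frac{m}{kT}(\alpha^2 + \beta^2 + \gamma^2)}\, d\alpha\, d\beta\, d\gamma = \left(\frac{\pi kT}{m}\right)^{3/2}.$$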

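There remains the triple integration over $a$, $b$, $c$. We transform to
spherical coordinates, i.e.,

$$a = w \sin\theta \cos\phi, \qquad b = w \sin\theta \sin\phi, \qquad c = w \cos\theta,$$

whence

$$a^2 + b^2 + c^2 = w^2,$$

and

$$da\, db\, dc = w^2 \sin\theta\, d\theta\, d\phi\, dw.$$

This yields

$$\iiint_{-\infty}^{+\infty} (\ )\, da\, db\, dc = \int_0^{2\pi}\!\int_0^{\pi}\!\int_0^{\infty} w^3 e^{-mw^2/4kT} \sin\theta\, d\theta\, d\phi\, dw = 32\pi \left(\frac{kT}{m}\right)^{2},$$

and therefore the final result is

$$Z_c = 4 D^2 n \sqrt{\frac{\pi kT}{m}} = 4\sqrt{\frac{\pi}{3}}\, D^2 n v_m. \tag{66}$$
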

This value of $Z_c$ should be compared with that in eq. (44) which we
obtained on the assumption that all the molecules move with the
velocity $v_m$. Since $\pi/3$ is very close to unity, the difference
between the two formulas is slight, amounting in fact to only a little
over 2 per cent. While it is decidedly unsafe to generalize on the
basis of a single illustration, this result does suggest that the
invocation of the Maxwell distribution does not alter the general
functional form of simple kinetic theory formulas but may be expected
merely to introduce a slightly altered numerical multiplying factor.
This presumption is realized to some extent by more extensive study,
though the change in the factor may indeed be larger than in the case
just discussed.

The value of the mean free path obtained from (66) turns out to be

$$\lambda = \frac{v_m}{Z_c} = \frac{1}{4D^2 n}\sqrt{\frac{3}{\pi}}, \tag{67}$$

differing very slightly from our more simple version in eq. (45).
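
The size of the Maxwell correction can be illustrated by estimating the mean relative speed of two molecules directly. Since the collision rate is $\pi D^2 n$ times the mean relative speed, the ratio of the two rates equals the ratio of the two mean relative speeds. The sketch below is a Monte Carlo estimate under stated assumptions: speeds are measured in units of $\sqrt{kT/m}$, so that each Cartesian velocity component is Gaussian with unit variance as in (50); the sample size and seed are arbitrary choices; and the exact Maxwell value is taken in its standard closed form $4/\sqrt{\pi}$ in these units.

```python
import math
import random


def mean_relative_speed(samples=100_000, seed=0):
    """Monte Carlo mean of |v1 - v2| for two Maxwellian molecules.

    Speeds are in units of sqrt(kT/m); each velocity component is then
    an independent Gaussian of unit variance.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        dx = rng.gauss(0.0, 1.0) - rng.gauss(0.0, 1.0)
        dy = rng.gauss(0.0, 1.0) - rng.gauss(0.0, 1.0)
        dz = rng.gauss(0.0, 1.0) - rng.gauss(0.0, 1.0)
        total += math.sqrt(dx * dx + dy * dy + dz * dz)
    return total / samples


V_M = math.sqrt(3.0)                    # root-mean-square speed in these units
clausius = (4.0 / 3.0) * V_M            # mean relative speed in the equal-speed model
maxwell_exact = 4.0 / math.sqrt(math.pi)  # closed form for the Maxwell average
maxwell_mc = mean_relative_speed()

ratio = maxwell_exact / clausius        # ratio of collision rates
print(f"Monte Carlo mean relative speed = {maxwell_mc:.4f}")
print(f"closed form                      = {maxwell_exact:.4f}")
print(f"ratio of collision rates         = {ratio:.4f}")
```

The ratio is $\sqrt{3/\pi}$, a little over 2 per cent below unity, and the mean free path computed from the Maxwell-averaged rate is larger than the Clausius value by the reciprocal of the same factor.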

7. TRANSPORT PHENOMENA 

The phenomenon of viscosity as described by the kinetic theory 
of gases is an illustration of a group of effects known under the head- 
ing of transport phenomena. Other well-known illustrations are heat 
conduction and diffusion. It is not our intention to give a detailed 
discussion of these effects which are usually extensively treated in 
the standard kinetic theory texts. 9 However, it will be worth while 
to discuss them qualitatively and give for reference some of the funda- 
mental formulas, particularly with a view to the influence of the 
Maxwell distribution on the results. 

We have already mentioned viscosity in Sec. 4 and have given
a "derivation" of a sort for the coefficient of viscosity of a gas. The
application of the Maxwell law to the problem is a rather involved
matter but finally leads to the result

$$\eta = 0.310\, \rho v_m \lambda, \tag{68}$$

differing from the value in eq. (48) by the difference between 1/3 and
0.310. Actually the problem proves to be more complicated than
even the Maxwell distribution would make it; in fact account must
be taken of the fluctuations which occur from the precise Maxwell
distribution. These exert a considerable influence on the viscosity
and tend to increase the numerical factor in (68) to nearly 0.5. 10
The conduction of heat through a gas, like that through a solid,
takes place whenever a temperature gradient exists. The thermal
conductivity $K$ is defined by the fundamental equation

$$\frac{dQ}{dt} = -K \frac{dT}{dx}, \tag{69}$$

9 Cf., for example, Kennard, op. cit., pp. 135 ff.

10 This will be true only for an infinitely rare gas. This result is due to
Chapman. For the reference, consult Kennard, op. cit., p. 147.

where $dQ/dt$ is the rate of flow of heat per unit area per second in
the x direction and dT/dx is the temperature gradient in this direc- 
tion. From the standpoint of kinetic theory thermal conduction is 
due to the transfer of the energy of the on-the-average faster moving 
molecules from the high temperature region to the lower temperature 
region. Of course molecules from the low temperature region also 
move to the higher temperature region but they carry less energy 
and hence there is a net transfer of energy in the direction of decreas- 
ing temperature. We can even give a simple theory of thermal con- 
duction by considering the transfer of energy from one layer of a gas 
to another just as we treated the transfer of momentum in the 
simplified discussion of viscosity in Sec. 4. 

Let us refer once more to Fig. 5·3 and assume that the average
kinetic energy per molecule in plane A is $E_1$ while that in plane B
is $E_2$. By utilizing assumptions similar to those employed in Sec. 4,
we arrive at the conclusion that the $nv_m/6$ molecules which cross
unit area per second moving downward from A transfer the kinetic
energy $(nv_m/6)E_1$, while those moving upward from B transfer the
amount $(nv_m/6)E_2$. There is thus a net transfer of kinetic energy of

$$\frac{nv_m}{6}(E_1 - E_2)$$

over a distance of $2\lambda$. But we can write

$$\frac{E_1 - E_2}{2\lambda} = -\frac{dE}{ds},$$

where $dE/ds$ is the gradient of kinetic energy in the direction of
temperature change. The rate of flow of heat energy per unit area per
second then becomes

$$\frac{dQ}{dt} = -\frac{nv_m\lambda}{3}\frac{dE}{ds}, \tag{70}$$

where the negative sign emphasizes that the flow takes place along
the negative gradient. Now we can get a connection with (69) by
observing that

$$\frac{dE}{ds} = \frac{dE}{dT}\frac{dT}{ds}, \tag{71}$$

whence

$$K = \frac{1}{3} n v_m \lambda \frac{dE}{dT}. \tag{72}$$

But $n\, dE/dT$ is the specific heat of the gas at constant volume per
unit volume and divided by the density is the ordinary specific heat
$c_V$ (eq. 28). Hence for an ideal homogeneous gas this simplified
theory yields

$$K = \eta c_V. \tag{73}$$



This is an extremely interesting result, connecting as it does the 
thermal conductivity with the viscosity and specific heat at constant 
volume. All these are independently measurable quantities so that 
the relation provides a good test of the success of kinetic theory. 
Since the derivation is idealized and all intermolecular action is ignored, 
we cannot expect indeed that (73) will be satisfied exactly for any gas 
though there should be agreement in order of magnitude. A table of 
experimental values quoted by Kennard 11 indicates that the ratio
$K/\eta c_V$ for the rare gases helium, neon, and argon averages about
2.45. For hydrogen, nitrogen, and oxygen the ratio is around 2. It is
interesting to note that for the gases with more complicated molecules
like NH$_3$, the ratio drops to approximately 1.5. It is clear that the
relation (73) has only order of magnitude validity and further study is
necessary to establish an accurate relation. This has been done by
Chapman (cf. Kennard, as before, for references) who found that for
monatomic gases (73) should be replaced by $K = \tfrac{5}{2}\eta c_V$.
This is in good agreement with experiment. On the other hand for
polyatomic gases Eucken has derived the formula
$K = \tfrac{1}{4}(9\gamma - 5)\eta c_V$ (with $\gamma = c_P/c_V$) which
is also in excellent agreement with experiment.
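
To see how the Eucken formula interpolates between the cases just described, the ratio $K/\eta c_V = (9\gamma - 5)/4$ can be tabulated for a few values of $\gamma$. The values of $\gamma$ below are the ideal-gas figures for monatomic, diatomic, and rigid non-linear polyatomic molecules; they are supplied here as assumptions and are not taken from the text.

```python
def eucken_ratio(gamma):
    """Ratio K / (eta * c_V) from Eucken's formula, (9*gamma - 5) / 4."""
    return (9.0 * gamma - 5.0) / 4.0


# Ideal-gas specific heat ratios (assumed values, for illustration only).
GAMMAS = {
    "monatomic (gamma = 5/3)": 5.0 / 3.0,
    "diatomic (gamma = 7/5)": 7.0 / 5.0,
    "rigid polyatomic (gamma = 4/3)": 4.0 / 3.0,
}

ratios = {label: eucken_ratio(g) for label, g in GAMMAS.items()}
for label, value in ratios.items():
    print(f"{label:32s} K/(eta c_V) = {value:.3f}")
```

For $\gamma = 5/3$ the formula reduces to Chapman's factor 5/2, and for $\gamma = 7/5$ it gives a value close to 2, in line with the experimental ratios quoted above.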

Diffusion takes place in a gas composed of two or more different 
kinds of molecules where the concentrations vary from point to point. 
The result is an eventual evening out of differences of composition 
with the ultimate attainment of uniform concentration throughout 
the gas. Kinetic theory describes this process in terms of the greater 
probability of molecules of a given kind to move from a region of 
high concentration to one of low concentration than in the reverse 
direction. This, of course, tends to wipe out differences in concen- 
tration. Diffusion may be described quantitatively in terms of the 
diffusion coefficient. Suppose the gas contains two different kinds
of molecules with concentration $n_1$ and $n_2$ (number of molecules per
unit volume). These are functions of both space and time. An
analysis of the problem, into which we shall not go, shows that for

11 Op. cit., p. 180.

an ideal gas the variation of $n_1$ and $n_2$ with $t$ and $x, y, z$ is
expressible by the fundamental differential equations

$$\frac{\partial n_1}{\partial t} = D \nabla^2 n_1, \qquad \frac{\partial n_2}{\partial t} = D \nabla^2 n_2. \tag{74}$$

In these equations $D$ is called the diffusion coefficient for the mixture
of the two gases. It has the dimensions of square centimeters/second.
The reader will note the interesting resemblance between eqs. (74) and
the ordinary differential equation for heat conduction, obtained from
the definition eq. (69). Thus if $T$ is the absolute temperature

$$\frac{\partial T}{\partial t} = \frac{K}{\rho c_V} \nabla^2 T. \tag{75}$$

A simplified theory of diffusion due to Meyer leads to the following
equation connecting $D$ with the mean free path

$$D = \frac{1}{3}\left(\frac{n_2 v_{m1} \lambda_1 + n_1 v_{m2} \lambda_2}{n_1 + n_2}\right), \tag{76}$$

where the subscript 1 refers to the molecules of the first kind and 2 
to those of the second kind. It must be noted that the mean free 
paths in this formula are those given in terms of the root-mean-square 
velocity. The formula (76) takes no account of the Maxwell distri- 
bution nor of the more recent Chapman theoretical considerations. 
Kennard's book should be consulted for their effect. 

8. THE BOLTZMANN H-THEOREM

In Sec. 5 we introduced the Maxwell distribution of velocities 
of the molecules of a gas as a kinetic representation of the canonical 
distribution of particles in an aggregate with respect to their kinetic 
energy. This was based on purely statistical considerations in accord- 
ance with which the canonical distribution and hence the Maxwell 
distribution is the most likely one because there are more ways 
in which it can be realized than any other. As such we took it to 
correspond to the state of equilibrium of the aggregate. From the 
standpoint of pure kinetic theory this might appear to be a somewhat 
gratuitous assumption. When we realize that the molecules are in 
the continual process of changing their velocities by collision, how 
can we be sure that any steady distribution of velocities will ever be
obtained, or that if such does occur the distribution may not actually
be different from the Maxwellian? It is worth pointing out that
these questions do not arise in a strictly statistical theory.
Nevertheless they are fundamental in kinetic theory. They were answered
by Boltzmann, 12 and the answer is included in his celebrated H-theorem.
Boltzmann investigated the quantity

$$H = \iiint f \log f \, dk, \tag{77}$$

where

$$f = f(v_x, v_y, v_z, t) \tag{78}$$

is the velocity distribution function, e.g., the Maxwell $f(v)$ in our
eq. (51) (divided by the constant $(m/2\pi kT)^{3/2}$ to make it
non-dimensional) and $dk = dv_x\, dv_y\, dv_z$. The integral in (77) is a
triple one with the limits from $-\infty$ to $+\infty$. We have indicated
in (78) that $f$ may be a function of the time; it is not, of course, in
the Maxwellian case.
Boltzmann proceeded to form $dH/dt$ from (77) in terms of $df/dt$.
The latter quantity can be expressed in terms of $f$ itself by means of
a rather general and somewhat elaborate study of the effect of
collisions on the distribution function. We shall not go into this, but
merely state that without specification of the precise form of $f$, it
develops that

$$\frac{dH}{dt} \le 0. \tag{79}$$

In words, the function $H$ cannot increase with the time. In particular
for a steady distribution $dH/dt$ must vanish. But from the form of
$dH/dt$, its vanishing implies necessarily that $f$ have the Maxwellian
form. The Maxwellian distribution then is the only steady one, i.e.,
the only one which does not change with the time. This is the
essential content of the H-theorem. However, one can go on to show that
$H$ has a minimum value for the Maxwell distribution as compared
with all others having the same $v_m$, i.e., the same average energy.
Hence we see that the effect of the collisions is to make $f$ ultimately
assume the Maxwellian form if at any initial instant it does not 
possess it. One could even get an idea of the rapidity with which 
the Maxwell distribution is approached by computing dH/dt. For 
ideal gases under ordinary conditions of temperature and pressure 
this rate is very rapid indeed. It can be shown 13 that if the initial
distribution is non-Maxwellian and the mean square velocity components

12 K. Akad. Wiss. (Wien) Sitzungsberichte 66, 275 (1872).

13 Cf. Jeans, op. cit., p. 243.

in the three coordinate directions differ, the time in which
the difference between any two such components becomes $1/e$ of its
initial difference is of the order of

$$\tau = \frac{\eta}{p}, \tag{80}$$

where $\eta$ is as usual the viscosity and $p$ the pressure. This time was
called by Maxwell the "relaxation time." Thus for hydrogen at 0° C
and standard conditions

$$\tau \sim 8.4 \times 10^{-11}\ \text{sec}.$$

It will be noticed that this is comparable with the time taken by a 
hydrogen molecule under these conditions to traverse a free path 
equal to the mean free path. It gives a good idea of how quickly 
any deviations from the Maxwell distribution may be expected to 
disappear. 
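
The comparison between the relaxation time and the free-flight time can be made concrete with a short calculation. The sketch below evaluates $\tau = \eta/p$ for hydrogen and compares it with $\lambda/v_m$, using the viscosity and mean free path quoted earlier in the text. The value of one standard atmosphere in dynes per square centimeter, the Boltzmann constant, and the molecular mass of hydrogen are modern values supplied here as assumptions.

```python
import math

K_BOLTZMANN = 1.380649e-16     # erg/K (modern value; assumption)
AVOGADRO = 6.02214076e23       # per mole (modern value; assumption)
H2_MOLAR_MASS = 2.016          # g/mole (assumption)
P_STANDARD = 1.01325e6         # dyne/cm^2, one standard atmosphere (assumption)

ETA_H2 = 8.4e-5                # dyne sec/cm^2 at 0 deg C (from the text)
MEAN_FREE_PATH = 1.7e-5        # cm, hydrogen under standard conditions (from the text)
T_KELVIN = 273.15              # 0 deg C

# Relaxation time, eq. (80).
tau = ETA_H2 / P_STANDARD

# Time to traverse one mean free path at the root-mean-square speed.
m = H2_MOLAR_MASS / AVOGADRO
v_m = math.sqrt(3.0 * K_BOLTZMANN * T_KELVIN / m)
free_time = MEAN_FREE_PATH / v_m

print(f"tau       = {tau:.2e} sec")
print(f"lambda/vm = {free_time:.2e} sec")
print(f"ratio     = {tau / free_time:.2f}")
```

The two times agree to within a factor close to unity. In the simple theory this is no accident: combining $\eta = \rho v_m \lambda/3$ from (48) with the elementary kinetic expression $p = \rho v_m^2/3$ for the pressure gives $\eta/p = \lambda/v_m$ exactly.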

In our later study of statistical mechanics we shall find an
interesting connection between the $H$ function of Boltzmann and the
entropy. The theorem (79) will be found indeed to have a close
relation with the law of increasing entropy of an isolated system.

9. EQUATION OF STATE OF A REAL GAS. VIRIAL FOR INTERACTION FORCES

The treatment of kinetic theory in this chapter has been limited 
to the ideal gas in which forces between the individual molecules, 
aside from those arising from collisions, are entirely neglected. Much 
of the lack of precise agreement between the results of kinetic theory 
and the behavior of actual gases has been traced to this neglect. For 
example, the equation of state 

pV = NkT, 

derived from the simple kinetic theory in Sec. 1, describes a real gas 
accurately only when it is well above its critical temperature. Efforts 
have naturally been made to obtain a more satisfactory equation of 
state by taking into account molecular interaction. In spite of the 
years of investigation from Maxwell's time to the present, this subject 
can hardly be considered to be in satisfactory condition from a 
theoretical point of view. However, in this section we present an 
introduction to the subject, adopting the method of the virial already 
used for the ideal gas in Sec. 1. 

We shall suppose that the interaction forces in question are central,
i.e., depend only on the distance $r_{ij}$ from the $i$th to the $j$th
particle. If $m_i$ is the mass of the $i$th particle and $m_j$ that of the
$j$th, it will be assumed that the force between the two may be
expressed as $m_i m_j f(r_{ij})$. Hence the $x$ component of the force may
be written as

$$F_{ij}^{x} = m_i m_j f(r_{ij})\, \frac{x_i - x_j}{r_{ij}}, \tag{81}$$

where $x_i$ and $x_j$ are the $x$ coordinates of the two particles
respectively. We must now apply this to the computation of the virial
in eq. (6). Consider first simply the two particles $i$ and $j$. The
contribution to the virial arising from their mutual interaction is then

$$-\frac{m_i m_j f(r_{ij})}{2 r_{ij}} \left[ x_i(x_i - x_j) + x_j(x_j - x_i) + \text{similar terms in } y \text{ and } z \right],$$

which becomes on reduction and combination of terms

$$-\tfrac{1}{2}\, \overline{m_i m_j f(r_{ij})\, r_{ij}}. \tag{82}$$

There is an important point about (82) which deserves attention.
Due to the symmetrical average distribution of the particles the
average force on each is zero (save close to the walls of the container).
We must be careful not to conclude, however, that therefore

$$\overline{m_i m_j f(r_{ij})\, r_{ij}} = 0.$$

Summing over the whole aggregate of particles we have

$$-\tfrac{1}{2} \sum_{i,j} \overline{m_i m_j f(r_{ij})\, r_{ij}},$$

where the summation is extended over all pairs of particles. If now
we combine this with the contribution to the virial due to the surface
forces (eq. 11), the virial theorem yields

$$\tfrac{3}{2} NkT = \tfrac{3}{2} pV - \tfrac{1}{2} \sum_{i,j} \overline{m_i m_j f(r_{ij})\, r_{ij}}. \tag{83}$$

The equation of state then takes the general form

$$pV = NkT + \tfrac{1}{3} \sum_{i,j} \overline{m_i m_j f(r_{ij})\, r_{ij}}. \tag{84}$$



If we were acquainted with the precise nature of the intermolecular
forces, i.e., the $f(r_{ij})$, and could evaluate the sum involved in the
second term on the right-hand side of (84) we should have a theoretical
deduction of the exact equation of state of any real gas. However,
we do not know the intermolecular forces, except in so far as better
information about them is being obtained by quantum mechanical
studies which are not considered in the present purely classical
treatment. Of course one might try various force laws and see how the
resulting equations compare with the empirically known equations of 
state. The difficulty comes, however, in carrying out the evaluation 
of the summation for any force law. We can at any rate draw certain 
semi-quantitative conclusions from the general expression (84). In 
the first place we can recall the bearing of an important observation 
of Maxwell on the equation. From a certain point of view the equa- 
tion says that the pressure of a gas may be thought of as arising from 
two sources, i.e., (a) the motion of the molecules symbolized by the
term $NkT$ in (84) and (b) the intermolecular forces symbolized by the
summation term in (84). One finds it interesting to ask what would
be the result of assuming that the pressure arises primarily from the
forces rather than from the motions. This of course would imply
repulsive forces between the molecules. But Maxwell observed that in
this case Boyle's law $pV$ = constant at constant temperature would
demand that $f(r_{ij})$ should vary as $1/r_{ij}$, for only in this way
could $\sum f(r_{ij})\, r_{ij}$ be made constant. But this inverse distance
force law would
lead to the result that the force action of the portion of the gas far 
away from a particular molecule would be greater than that near at 
hand. (Recall that the number of molecules in the neighborhood of a 
particular molecule goes up roughly as the square of the distance from 
the molecule.) This would produce differences of pressure in vessels 
of the same volume but different shape, even at the same temperature,
an anomaly not observed. Maxwell therefore concluded that
most of the reason for the pressure of a gas on the molecular theory 
must be sought in the kinetic energy term and the summation term 
must be considered a correction only. 

If we consider the intermolecular forces as cohesive in nature, i.e.,
as attractive but falling off in intensity very sharply with the distance,
we can estimate the summation $\sum m_i m_j r_{ij} f(r_{ij})$ rather
readily as follows. Let us first replace $m_i m_j f(r_{ij})$ by
$\phi(r_{ij})$. Consider the interaction between the group of molecules
in volume $dV$ placed for convenience at the origin of a system of
spherical coordinates and the group in volume $dV'$ all at distance $r$
from the first group. Let $n$ as usual denote the number of molecules
per unit volume and neglect to a first approximation the effect of the
cohesive forces on the uniformity of spatial distribution. Then the
contribution of the forces between the two groups to the sum in
question becomes

$$n^2\, dV\, dV'\, r\phi(r).$$

The total contribution for the whole gas will be obtained by
integrating the above over $dV$ and $dV'$ and dividing by 2 to avoid
counting each pair of volume elements twice. Let $dV' = 4\pi r^2\, dr$.
We need not worry over the precise upper limit of the integration since
$\phi(r)$ is assumed to fall off very rapidly with $r$. Hence

$$\sum \overline{r\phi(r)} = \frac{n^2}{2} \int dV \int 4\pi r^3 \phi(r)\, dr = 2\pi n^2 \int dV \int r^3 \phi(r)\, dr. \tag{85}$$

The average required by the definition of the virial is here effectively
implied in the procedure of taking the average number of particles per
unit volume. Now $\int r^3 \phi(r)\, dr$ is a negative constant which we
may call $-a'$, while $\int dV = V$, the total volume of the gas. Hence

$$\sum \overline{r\phi(r)} = -2\pi n^2 a' V = -\frac{2\pi N^2 a'}{V}. \tag{86}$$

Equation (84) then becomes

$$p = \frac{NkT}{V} - \frac{2\pi N^2 a'}{3V^2}.$$

If further we let $\tfrac{2}{3}\pi N^2 a' = a$, a constant which does
depend, of course, on the total amount of gas, the equation of state
becomes

$$\left(p + \frac{a}{V^2}\right) V = NkT. \tag{87}$$

When the molecules get very close together it is necessary to assume
that the attractive forces become repulsive. The calculation of the
contribution of these repulsive forces to the virial is a more
complicated matter. It is carried out by Jeans 14 and leads to the result
that we must add to $\sum \overline{r\phi(r)}$ the term

$$\frac{3NkT\, b}{V}, \tag{88}$$

where $b$ has the value

$$b = \tfrac{2}{3} \pi N D^3, \tag{89}$$

the molecules still being supposed to be elastic spheres with diameter
$D$. The result of (88) is to modify eq. (84) still further. If $b/V$ is
small the resulting equation may be written

$$\left(p + \frac{a}{V^2}\right)(V - b) = NkT. \tag{90}$$

u Op. cit., p. 131. Cf. also Loeb, op. cit., p. 138. 
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This is the familiar van der Waals* equation, which is fairly successful 
in describing the behavior of many real gases. It is interesting to 
note that b appears as equal to four times the total volume of all the 
molecules in the gas. In many discussions of van der Waals' equation 
it is introduced as a correction to the volume of the gas to allow for 
the volume actually occupied by the molecules. In the present dis- 
cussion it arises more properly as a contribution from the very short 
range repulsive forces which must exist between the molecules. 
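
The remark that b equals four times the total molecular volume can be checked numerically. The following is a minimal sketch, assuming CGS units; the function names and the sample values of N, D, and a are illustrative assumptions and do not come from the text.

```python
import math

K_BOLTZMANN = 1.380649e-16  # erg per kelvin (CGS)


def vdw_b(n_molecules, diameter):
    """Excluded-volume constant of eq. (89): b = (2/3) * pi * N * D**3."""
    return (2.0 / 3.0) * math.pi * n_molecules * diameter ** 3


def total_molecular_volume(n_molecules, diameter):
    """Volume of N elastic spheres of diameter D."""
    return n_molecules * (4.0 / 3.0) * math.pi * (diameter / 2.0) ** 3


def vdw_pressure(n_molecules, volume, temperature, a, b):
    """Solve eq. (90), (p + a/V**2)(V - b) = NkT, for the pressure p."""
    return (n_molecules * K_BOLTZMANN * temperature / (volume - b)
            - a / volume ** 2)


# Hypothetical sample values, chosen only to exercise the formulas.
N = 6.0e23          # number of molecules
D = 3.0e-8          # molecular diameter in cm
a_const = 1.0e12    # cohesion constant in dyne cm^4

b_const = vdw_b(N, D)
ratio = b_const / total_molecular_volume(N, D)
print("b / (total molecular volume) =", ratio)
print("p at V = 22400 cm^3, T = 273 K:",
      vdw_pressure(N, 22400.0, 273.0, a_const, b_const), "dyne/cm^2")
```

Setting a and b to zero in `vdw_pressure` recovers the ideal gas value NkT/V, which is a convenient sanity check on the signs.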

10. THE BROWNIAN MOTION AND MOLECULAR REALITY 

In concluding this brief survey of kinetic theory we encounter a 
question which must often have been seriously asked in the early 
days of the theory: Are molecules real? Do they actually exist and 
can we know about their existence more directly than by the success 
of the theory as a description of thermodynamical phenomena? From 
the standpoint of modern physical methodology these questions have 
little or no meaning. Nevertheless the attempt to answer them led 
to an exhaustive study of an interesting phenomenon and shed much 
light on other kinetic properties. The phenomenon takes its name 
from the English botanist Robert Brown, who in 1827 noticed the
irregular but ceaseless motion of small particles, e.g., gamboge, sus- 
pended in a liquid. The same phenomenon is also exhibited in striking 
fashion by smoke particles suspended in air. At first the motion was 
thought to be of organic origin but after the rise of the kinetic theory 
it became clear that the only reasonable explanation for it lies in the 
assumption that the particles are subject to the continual bombard- 
ment of the molecules of the surrounding medium. The most com- 
plete experimental study of the phenomenon is that of Perrin. 15 The 
theory has been developed by a number of investigators, including 
Einstein, Smoluchowski and Langevin. We shall present here the 
method due to Langevin. 16 

Consider the motion along the x axis of a single particle in a viscous
medium. The equation of motion is, by an extension of Stokes' law
for the fall of a sphere through a viscous fluid, 17

$$m\ddot{x} + 6\pi\eta a\dot{x} = X. \tag{91}$$

Here m is the mass of the particle, supposed to be a sphere of radius a,
and the viscosity of the medium is $\eta$. The X on the right represents
the unknown force due to the molecular impacts. All we know about
it is that it is positive as often as it is negative and sufficiently large
to maintain motion.

15 Cf. "Brownian Movement and Molecular Reality," trans. by F. Soddy, Taylor
and Francis, London, 1910.

16 P. Langevin, Comptes rendus, 146, 530 (1908).

17 See, for example, Page, op. cit., p. 273.

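It will be convenient to rewrite eq. (91) in terms of $\xi = x - x_0$,
which is the actual displacement of the particle in time t if $x_0$ is the
initial distance from the origin. Then multiplying through by $\xi$, we
obtain

$$m\xi\ddot{\xi} + 6\pi\eta a\,\xi\dot{\xi} = X\xi. \tag{92}$$

We can write at once

$$\xi\dot{\xi} = \frac{1}{2}\frac{d(\xi^2)}{dt}, \qquad \xi\ddot{\xi} = \frac{1}{2}\frac{d^2(\xi^2)}{dt^2} - \dot{\xi}^2.$$

Therefore

$$\frac{m}{2}\frac{d^2(\xi^2)}{dt^2} + 3\pi\eta a\,\frac{d(\xi^2)}{dt} = m\dot{\xi}^2 + X\xi. \tag{93}$$

If next we average over a large number of identical particles, $\overline{X\xi}$
becomes vanishingly small. We set $z = d\overline{\xi^2}/dt$ and get

$$\frac{m}{2}\dot{z} + 3\pi\eta a\,z = m\overline{\dot{\xi}^2}. \tag{94}$$

Now if the particles can be thought of as forming an ideal gas the
principle of equipartition of kinetic energy should apply and we can
set $m\overline{\dot{\xi}^2} = kT$. Hence eq. (94) becomes

$$\dot{z} = \frac{2kT}{m} - \frac{6\pi\eta a}{m}\,z, \tag{95}$$

which on integration yields

$$z = \frac{kT}{3\pi\eta a} + Ce^{-6\pi\eta a t/m}. \tag{96}$$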

Here C is a constant of integration. If the density of the material
composing the particles is, let us say, about 1.2 grams/cm$^3$; and if
the viscosity is that of water, i.e., approximately 0.012 gram/cm sec;
and if $a = 0.2 \times 10^{-4}$ cm, we have

$$6\pi\eta a/m \approx 10^{8}\ \text{sec}^{-1}.$$

Hence as far as any steady state is concerned we may safely neglect
the last term in (96). That equation then becomes

$$\frac{d\overline{\xi^2}}{dt} = \frac{kT}{3\pi\eta a},$$

yielding on integration

$$\overline{\xi^2} = \frac{kT}{3\pi\eta a}\,\tau + \overline{\xi_0^2} \tag{97}$$

for the mean square displacement along the x axis at time $\tau$, if $\overline{\xi_0^2}$ is
the initial mean square displacement. But we can always choose
$\overline{\xi_0^2} = 0$ and have finally

$$\overline{\xi^2} = \frac{kT}{3\pi\eta a}\,\tau, \tag{98}$$

which is the equation for the Brownian motion first deduced by
Einstein, giving the mean square displacement as a linear function
of the time. As an illustration consider a typical experiment of
Perrin in which 50 gamboge particles all with radii very close to
$a = 0.212 \times 10^{-4}$ cm were suspended in water at 20° C. The mean-square
displacement component along one direction, which we may
take to be the x axis, 18 was measured for the value $\tau = 30$ sec. The
result was $\overline{\xi^2} = 4.5 \times 10^{-7}$ cm$^2$. Substitution into (98) yields a
value for the Boltzmann constant

$$k = 1.23 \times 10^{-16}\ \text{ergs/degree C}.$$

This is of the correct order of magnitude and the result may be taken
as a confirmation of the essential success of the theory and the value
of the Brownian motion for the study of molecular phenomena.
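
The arithmetic of this illustration is easy to retrace. The sketch below is only a check on the quoted figures, under two assumptions of ours: 20° C is taken as 293 degrees absolute, and the viscosity is the value 0.012 gram/cm sec quoted earlier in the section. The function names are hypothetical.

```python
import math


def boltzmann_from_displacement(mean_sq_disp, viscosity, radius,
                                temperature, elapsed):
    """Solve eq. (98) for k: k = 3*pi*eta*a*<xi^2> / (T*tau), CGS units."""
    return (3.0 * math.pi * viscosity * radius * mean_sq_disp
            / (temperature * elapsed))


def damping_rate(viscosity, radius, density):
    """Return 6*pi*eta*a/m, the decay rate in eq. (96), for a uniform sphere."""
    mass = (4.0 / 3.0) * math.pi * radius ** 3 * density
    return 6.0 * math.pi * viscosity * radius / mass


# Perrin's figures as quoted in the text; T = 293 is our reading of 20 deg C.
k_estimate = boltzmann_from_displacement(4.5e-7, 0.012, 0.212e-4, 293.0, 30.0)

# Particle data used in the text for the estimate of 6*pi*eta*a/m.
rate = damping_rate(0.012, 0.2e-4, 1.2)

print(f"k from eq. (98): {k_estimate:.3e} erg/degree")
print(f"6*pi*eta*a/m:    {rate:.2e} per sec")
```

The decay rate of order $10^8$ per second shows why the exponential term in (96) is negligible on the scale of a 30 sec observation.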

PROBLEMS 

1. Use the virial theorem to prove that in an aggregate of particles which are
not confined in a box but which nevertheless have their motions bounded and attract
each other with a force varying inversely as the square of the distance of separation
the time average of the total kinetic energy is equal to minus one-half the time
average of the total potential energy.

2. One thousand molecules of hydrogen are in thermal equilibrium at 0° C.
Find the number with speeds between 0 and 100 meters/sec; 100 and 200 meters/sec
and so on up to 3,000 meters/sec. Plot the results in the form of a curve and
indicate on it the positions of $v^*$, $|v|$ and $v_m$.

3. In Problem 2, find the total number of molecules whose speeds are less than $v_m$.
What is the probability that a molecule shall have its speed lying between $v^*$ and $v_m$?

4. In Problem 2, find the total number of molecules whose component velocity
along the x axis lies between 500 meters/sec and 550 meters/sec, whose component
velocity along the y axis lies between 500 meters/sec and 450 meters/sec and
whose component velocity along the z axis lies between 600 meters/sec and 650
meters/sec. To what range of directions and resultant speed does this correspond?

5. The mass of an electron is $9 \times 10^{-28}$ gram. In an ideal gas composed of free
electrons, i.e., mutual interaction neglected, calculate the root-mean-square velocity
at 0° C. Compare the pressure exerted by the ideal electron gas on the walls of the
confining vessel at 0° C with that exerted by helium gas at the same temperature,
the number of particles per unit volume being assumed to be the same in each case.

18 Strictly speaking one ought to consider the component displacements along
the y and z axes also. The analytical treatment is the same as above. Perrin
should be consulted for details on this and other experimental points.

6. Fifty cubic centimeters of a certain mixture of oxygen and nitrogen at 0° C
contain $2 \times 10^{20}$ molecules of nitrogen and $5 \times 10^{18}$ molecules of oxygen. Calculate
the partial pressure due to each gas as well as the pressure exerted by the mixture.

7. Generalize the Maxwell distribution (50) to the case of an aggregate of independent
particles in the constant gravitational field at the surface of the earth.
Relate the result to the well known variation of density with height in the atmosphere.

8. Carry out the evaluation of the Jacobian of the transformation (62) and prove
that it is equal to unity. Find the Jacobian of the transformation from rectangular
to spherical coordinates.

9. Evaluate the mean free path for hydrogen, helium, and nitrogen molecules
under standard conditions. Plot the mean free path of hydrogen molecules as a
function of the temperature from 0° C to 300° C.

10. In van der Waals' equation of state (eq. 90) show that if the critical specific
volume, critical pressure and critical temperature are employed as units of specific
volume, pressure and temperature, respectively, the equation takes the form

$$\left(p' + \frac{3}{v'^2}\right)(3v' - 1) = 8T',$$

it being understood that one gram of gas is in question and that $p' = p/p_c$, $v' = v/v_c$,
and $T' = T/T_c$. For chlorine at 10 atmospheres and T = 293 K, find the values
of v. Which value corresponds to the purely gaseous state?



CHAPTER VI 
CLASSICAL STATISTICAL MECHANICS 

1. THE CHARACTERISTICS OF STATISTICAL MECHANICS 

The discussion of the preceding chapters on the application of the 
statistical method to physics may arouse wonder regarding the next 
step in this study. Conceivably we might develop the kinetic theory 
further. This, however, would necessarily involve the detailed study 
of forces between molecules, lying outside the purely statistical ideas. 
These matters are sufficiently discussed in professional treatises on 
kinetic theory. We have seen that a rational foundation for thermo- 
dynamics is provided by the Maxwell-Boltzmann statistics as treated 
in Chapter IV. This might appear to satisfy all our needs and render 
further investigation unnecessary save for the mere enumeration and 
solution of specific applications which after all are a part of thermody- 
namics anyway. 

The reader will certainly have noted, however, as a somewhat queer 
circumstance that whereas the kinetic theory employs mechanical 
ideas in its set-up, the Maxwell-Boltzmann statistics is essentially 
non-mechanical. Fundamentally all its results are based on the dis- 
tribution of objects or entities of any sort in groups with respect to 
some property or properties. Moreover the entities in question are 
assumed to be independent and without mutual influence. It is true 
we have introduced a few special applications of the Maxwell-Boltz- 
mann point of view to mechanical systems consisting of free particles. 1 
Further progress would now appear to be possible in the construction 
of a new point of view in which mechanical and statistical concepts are 
welded together from the beginning in a single unified theory. The 
postulates of such a theory, with which we shall associate the name 
statistical mechanics, will necessarily appear general and abstract. 
Nevertheless we can hope that they will serve as a rational basis for 
thermodynamics in the sense that the theorems deduced from them 
will form the laws of thermodynamics and that from the results of the
theory we can calculate (as averages, of course) the significant functions
in the thermodynamical equations. If such a theory turns out to
be successful, its very generality should guarantee its universal
applicability.

1 Moreover in Sec. 9, Chapter V, we discussed the kinetic theory for a gas with
interacting particles, but this transcends the strictly logical application of the
Maxwell-Boltzmann statistics.

The first to found statistical mechanics in the formal sense just 
explained was J. Willard Gibbs. 2 We shall refer to the method of 
Gibbs as classical statistical mechanics and it will form the subject 
matter of the present chapter. With the advent of quantum theory, 
statistical mechanics has turned in a new direction which will be con- 
sidered in detail in the latter part of this book. The fundamental 
ideas of Gibbs, however, are as important today as they were in his 
own time and no serious student of statistical mechanics can afford to 
neglect them. 

Since statistical mechanics employs in its formulation the concepts 
of advanced mechanics and in particular the canonical equations of 
Hamilton, it will be desirable to review these briefly. 

2. REVIEW OF ADVANCED MECHANICS 

Consider an aggregate of n particles with masses $m_1, m_2, \ldots, m_n$ and
position vectors $\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_n$ in some rectangular coordinate system.
Thus, if $x_j$, $y_j$, $z_j$ are the rectangular coordinates of the jth particle

$$\mathbf{r}_j = \mathbf{i}x_j + \mathbf{j}y_j + \mathbf{k}z_j.$$

It will be supposed that there are external forces acting on the particles;
these are denoted by $\mathbf{F}_1, \mathbf{F}_2, \ldots, \mathbf{F}_n$. According to D'Alembert's
principle and the principle of virtual displacements 3 the motion of the
aggregate is given by the fundamental equation

$$\sum_{j=1}^{n}\left(m_j\ddot{\mathbf{r}}_j - \mathbf{F}_j\right)\cdot\delta\mathbf{r}_j = 0. \tag{1}$$

Here $\delta\mathbf{r}_j$ represents a possible, i.e., virtual, displacement of the jth
particle subject to the constraints acting on the system. In general we
distinguish between an actual displacement which a particle undergoes
in time and a possible displacement which it might but does not
necessarily have, by denoting the former as usual by $d\mathbf{r}_j$ and the latter
by $\delta\mathbf{r}_j$. A brief examination discloses that $\delta$ has the same formal
properties as a mathematical operator that d possesses. Thus we can
write at once

$$\ddot{\mathbf{r}}_j\cdot\delta\mathbf{r}_j = \frac{d}{dt}\left(\dot{\mathbf{r}}_j\cdot\delta\mathbf{r}_j\right) - \tfrac{1}{2}\delta(v_j^2), \tag{2}$$

where $v_j^2 = \dot{\mathbf{r}}_j\cdot\dot{\mathbf{r}}_j$, and is the square of the resultant velocity magnitude
for the jth particle. Therefore from (1)

$$\sum_{j=1}^{n} m_j\frac{d}{dt}\left(\dot{\mathbf{r}}_j\cdot\delta\mathbf{r}_j\right) = \delta T + \sum_{j=1}^{n}\mathbf{F}_j\cdot\delta\mathbf{r}_j, \tag{3}$$

where we have introduced T for the kinetic energy $\tfrac{1}{2}\sum_j m_j v_j^2$. If we
integrate both sides of this equation with respect to the time from $t_0$ to
$t_1$, we get

$$\left[\sum_{j=1}^{n} m_j\dot{\mathbf{r}}_j\cdot\delta\mathbf{r}_j\right]_{t_0}^{t_1} = \int_{t_0}^{t_1}\delta T\,dt + \int_{t_0}^{t_1}\sum_{j=1}^{n}\mathbf{F}_j\cdot\delta\mathbf{r}_j\,dt. \tag{4}$$

Let us limit our attention to possible motions all of which have the
same initial and final positions respectively; then $\delta\mathbf{r}_j$ will vanish for all
j at both $t_0$ and $t_1$ and (4) will become

$$\int_{t_0}^{t_1}\left(\delta T + \sum_{j=1}^{n}\mathbf{F}_j\cdot\delta\mathbf{r}_j\right)dt = 0. \tag{5}$$

2 "Elementary Principles in Statistical Mechanics," Yale University Press, 1903.
See also his Collected Works, Vol. II, Part 1, 1930. This famous volume was once
characterized by Henri Poincaré as a "little book, little read because it is a little
hard." It is indeed a model of conciseness. There has recently appeared a useful
commentary on it by Arthur Haas in "Commentary on the Scientific Writings of
J. Willard Gibbs," Vol. II, Yale University Press, 1936.

3 Cf. Lindsay and Margenau, "Foundations of Physics," p. 102.

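Finally, let us assume that the forces are conservative. This means
that a potential energy function $V(x_j, y_j, z_j)$ exists such that

$$\mathbf{F}_j = -\left(\mathbf{i}\frac{\partial V}{\partial x_j} + \mathbf{j}\frac{\partial V}{\partial y_j} + \mathbf{k}\frac{\partial V}{\partial z_j}\right). \tag{6}$$

Then since

$$\sum_{j=1}^{n}\mathbf{F}_j\cdot\delta\mathbf{r}_j = -\sum_{j=1}^{n}\left(\frac{\partial V}{\partial x_j}\delta x_j + \frac{\partial V}{\partial y_j}\delta y_j + \frac{\partial V}{\partial z_j}\delta z_j\right) = -\delta V, \tag{7}$$

we can write (5) in the more convenient form 4

$$\int_{t_0}^{t_1}\delta(T - V)\,dt = 0. \tag{8}$$

The quantity $T - V$ is usually referred to as the Lagrangian function
of the system and denoted by L. It is here a function of the coordinates
and their velocities.

4 The mature reader will see at once the connection between (8) and Hamilton's
principle. No discussion is needed here, however. Cf. the discussion in Lindsay
and Margenau, op. cit., pp. 128 f.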

We now find it convenient to abandon the rectangular coordinates
in favor of generalized coordinates $q_1 \cdots q_f$, which may have any
character as long as a knowledge of them as functions of the time is
sufficient to fix the position of every particle of the system at any
instant. Their number f is called the number of degrees of freedom of
the system. In the case of the aggregate of n particles, free to move in
three-dimensional space, we clearly have f = 3n. Any constraints on
the freedom of motion of the system will, of course, decrease f. Now
the q's will necessarily be related to the rectangular coordinates, and
we shall express the relation in the following way

$$x_j = x_j(q_1 \cdots q_f), \qquad y_j = y_j(q_1 \cdots q_f), \qquad z_j = z_j(q_1 \cdots q_f). \tag{9}$$

The rectangular velocity components become

$$\dot{x}_j = \sum_{k=1}^{f}\frac{\partial x_j}{\partial q_k}\dot{q}_k, \qquad \dot{y}_j = \sum_{k=1}^{f}\frac{\partial y_j}{\partial q_k}\dot{q}_k, \qquad \dot{z}_j = \sum_{k=1}^{f}\frac{\partial z_j}{\partial q_k}\dot{q}_k. \tag{10}$$

From (10) the kinetic energy T can also be expressed in terms of the
generalized coordinates and generalized velocities $\dot{q}_1 \cdots \dot{q}_f$. It can be
shown that T has the form of a homogeneous quadratic function of the
$\dot{q}_i$. Thus

$$T = \sum_{i=1}^{f}\sum_{k=1}^{f} a_{ik}\dot{q}_i\dot{q}_k, \tag{11}$$

where the $a_{ik}$ are functions of the $q_j$. Clearly, since the potential
energy V for a conservative system is a function of the $x_j$, $y_j$, $z_j$ only,
it will also depend solely on the $q_j$. The problem is to examine eq. (8),
keeping in mind that $L = T - V$ is a function of the $q_j$ and $\dot{q}_j$. We
could then obtain the Lagrangian equations of motion in the fashion
followed in many books. Since these are not the ones useful for
statistical mechanics, however, we proceed otherwise.

Let us introduce the quantity $p_j$ defined by

$$p_j = \frac{\partial L}{\partial\dot{q}_j}. \tag{12}$$

This will be called the momentum conjugate to $q_j$. It is, of course, a
generalized momentum component and only reduces to the ordinary
momentum of Newtonian mechanics when $\dot{q}_j$ is an actual velocity. It
will turn out to be a more important quantity for our further
considerations than $\dot{q}_j$. Next we bring in the function H defined in the
following way

$$H = \sum_{j=1}^{f} p_j\dot{q}_j - L. \tag{13}$$

The function H is known as the Hamiltonian function of the system.
From its construction it might appear to be a function of the $q_j$, $\dot{q}_j$ and
$p_j$. Actually it reduces explicitly to a function of the $q_j$ and $p_j$. For
let us find how H changes when the $q_j$, $\dot{q}_j$ and $p_j$ change by $dq_j$, $d\dot{q}_j$ and
$dp_j$ respectively. We have

$$dH = \sum_{j=1}^{f}\left(p_j\,d\dot{q}_j + \dot{q}_j\,dp_j - \frac{\partial L}{\partial q_j}\,dq_j - \frac{\partial L}{\partial\dot{q}_j}\,d\dot{q}_j\right). \tag{14}$$

From the definition of $p_j$ in (12), the first and last terms in the
parenthesis cancel and we are left with

$$dH = \sum_{j=1}^{f}\left(\dot{q}_j\,dp_j - \frac{\partial L}{\partial q_j}\,dq_j\right), \tag{15}$$

substantiating our statement that H is an explicit function of the $q_j$
and $p_j$ only. We shall denote this by writing H in the form $H(q_j, p_j)$
which is an abbreviation of the longer form $H(q_1 \cdots q_f, p_1 \cdots p_f)$.
This notation will be used extensively in what follows.
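Equation (8) now takes the form

$$\int_{t_0}^{t_1}\delta\left[\sum_{j=1}^{f} p_j\dot{q}_j - H(p_j, q_j)\right]dt = 0, \tag{16}$$

which immediately becomes

$$\int_{t_0}^{t_1}\sum_{j=1}^{f}\left[\dot{q}_j\,\delta p_j + p_j\,\delta\dot{q}_j - \frac{\partial H}{\partial p_j}\,\delta p_j - \frac{\partial H}{\partial q_j}\,\delta q_j\right]dt = 0. \tag{17}$$

Partial integration yields

$$\int_{t_0}^{t_1} p_j\,\delta\dot{q}_j\,dt = \Big[p_j\,\delta q_j\Big]_{t_0}^{t_1} - \int_{t_0}^{t_1}\dot{p}_j\,\delta q_j\,dt = -\int_{t_0}^{t_1}\dot{p}_j\,\delta q_j\,dt,$$

since all the $\delta q_j$ and $\delta p_j$ vanish at $t_0$ and $t_1$. Therefore (17) becomes

$$\int_{t_0}^{t_1}\sum_{j=1}^{f}\left[\dot{q}_j\,\delta p_j - \frac{\partial H}{\partial p_j}\,\delta p_j - \dot{p}_j\,\delta q_j - \frac{\partial H}{\partial q_j}\,\delta q_j\right]dt = 0. \tag{18}$$

Now except for the restriction on the $\delta q_j$ and $\delta p_j$ at $t_0$ and $t_1$, they are
otherwise completely and independently arbitrary. The only way in
which (18) can be satisfied under these conditions is to have the
integrand vanish identically for all $\delta q_j$ and $\delta p_j$. The independence of the
$\delta q_j$ and $\delta p_j$ then implies that we must have

$$\dot{q}_j = \frac{\partial H}{\partial p_j}, \qquad \dot{p}_j = -\frac{\partial H}{\partial q_j}. \tag{19}$$

These 2f equations of the first order are known as the canonical
equations of motion, or alternatively as the Hamiltonian equations of
motion.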

The connection between the Hamiltonian $H(p_j, q_j)$ and the better
known functions T and V becomes clear from its definition (13). Since
V is independent of the $\dot{q}_j$, we have

$$p_j = \frac{\partial L}{\partial\dot{q}_j} = \frac{\partial T}{\partial\dot{q}_j}, \tag{20}$$

and (13) becomes

$$H = \sum_{j=1}^{f}\dot{q}_j\frac{\partial T}{\partial\dot{q}_j} - T + V. \tag{21}$$

If we differentiate T in (11) with respect to $\dot{q}_j$, then multiply by $\dot{q}_j$
and sum, the result is 5

$$\sum_{j=1}^{f}\dot{q}_j\frac{\partial T}{\partial\dot{q}_j} = 2T. \tag{22}$$

Hence

$$H = T + V. \tag{23}$$

In other words, for a conservative system the Hamiltonian function is
simply the total energy expressed as a function of the generalized
coordinates and conjugate momenta. For example, for a simple
harmonic oscillator with one degree of freedom

$$H = \frac{p^2}{2m} + \frac{kq^2}{2}, \tag{24}$$

the mass of the system being m and the stiffness k. The Hamiltonian
equations (19) reduce to two, viz.,

$$\dot{p} = -kq, \tag{25}$$

$$\dot{q} = \frac{p}{m}. \tag{26}$$

The second equation is identical with the definition $p = \partial L/\partial\dot{q}$, while
the first equation is the ordinary Newtonian equation of simple
harmonic motion.

5 This follows at once, of course, from Euler's theorem on homogeneous functions.
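
The two canonical equations of the oscillator can be stepped forward numerically as a check that the motion is periodic and that H stays constant. The following is a minimal sketch only: the leapfrog (kick-drift-kick) stepping scheme, the function names, and the sample values of m and k are our own assumptions and are not part of the text.

```python
import math


def hamiltonian(q, p, m, k):
    """Total energy of the oscillator: H = p**2/(2m) + k*q**2/2."""
    return p * p / (2.0 * m) + k * q * q / 2.0


def leapfrog(q, p, m, k, dt, n_steps):
    """Integrate qdot = p/m, pdot = -k*q with kick-drift-kick steps."""
    for _ in range(n_steps):
        p -= 0.5 * dt * k * q   # half step in p using pdot = -k q
        q += dt * p / m         # full step in q using qdot = p/m
        p -= 0.5 * dt * k * q   # second half step in p
    return q, p


# Hypothetical sample values.
m, k = 1.0, 4.0
q0, p0 = 1.0, 0.0
period = 2.0 * math.pi * math.sqrt(m / k)

n_steps = 10000
q1, p1 = leapfrog(q0, p0, m, k, period / n_steps, n_steps)

print("after one period: q =", q1, " p =", p1)
print("energy drift:", hamiltonian(q1, p1, m, k) - hamiltonian(q0, p0, m, k))
```

After one period $2\pi\sqrt{m/k}$ the phase returns very nearly to its starting value, and the energy drift stays small.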

An important property of the canonical equations (19) is their
invariance of form with respect to an arbitrary transformation of
generalized coordinates of the form

$$Q_j = Q_j(q_1 \cdots q_f), \qquad j = 1, 2, \ldots, f. \tag{27}$$

If we denote by $T'$ the kinetic energy in terms of $Q_j$ and $\dot{Q}_j$ and define
the new conjugate momentum $P_j$ by

$$P_j = \frac{\partial T'}{\partial\dot{Q}_j}, \tag{28}$$

it develops that we still have

$$\dot{Q}_j = \frac{\partial H'}{\partial P_j}, \qquad \dot{P}_j = -\frac{\partial H'}{\partial Q_j}, \tag{29}$$

where $H'$ is the function of $P_j$ and $Q_j$ obtained by substituting from
(27) and (28) into the original Hamiltonian $H(p_j, q_j)$. The fact that the
Hamiltonian equations are written in terms of generalized coordinates
of course suggests this invariance property. We shall find it useful in
the development of statistical mechanics.

3. PHASE SPACE 

The task of statistical mechanics is the description of the behavior
of large scale bodies in the form of solids and fluids by assuming them
to be dynamical systems with f degrees of freedom, where in general
$f \gg 1$. The complete specification of the state of the system is given
by the 2f coordinates and conjugate momenta $q_1 \cdots q_f$, $p_1 \cdots p_f$.
From this point of view the state is termed the phase of the system. It
is customary to think of the 2f quantities constituting the phase as
being represented geometrically by a point in a 2f dimensional
orthogonal space called the phase space of the system. Each system has its
own phase space. As the p's and q's vary with the time the phase
point moves and traces out a path in phase space. This path is not
entirely arbitrary since the p's and q's satisfy the Hamiltonian
equations of motion (19).

It will be worthwhile to illustrate by a special case which lends
itself to easy visualization even though it does not satisfy the
condition $f \gg 1$ and hence is not of much importance in practical
applications: the simple harmonic oscillator for which f = 1. Here
$H = p^2/2m + kq^2/2$, where m and k are the mass and stiffness of the
oscillator respectively, and the Hamiltonian equations become

$$\dot{q} = \frac{p}{m}, \qquad \dot{p} = -kq. \tag{30}$$

The phase space is two-dimensional, viz., the pq plane. Every point
of this plane represents a possible phase of a dynamical system of one
degree of freedom. If the total energy of the oscillator is constant and
equal to E its corresponding phase points are those lying on the
elliptical path (cf. Fig. 6-1)

$$\frac{p^2}{2m} + \frac{kq^2}{2} = E. \tag{31}$$

FIG. 6-1.

This ellipse has the semi-major axis $\sqrt{2E/k}$ and the semi-minor axis
$\sqrt{2Em}$. At any particular instant the phase of the oscillator is
represented by some point on the ellipse. As time passes and the actual
oscillator goes through its various phases in physical space we can
think of the corresponding phase point as moving about the ellipse
and repeating this motion periodically with the period equal to
$2\pi\sqrt{m/k}$, the actual period of the oscillator; incidentally this is equal
to the area of the phase ellipse divided by the energy. The rate at
which the phase point traverses the ellipse is given by eqs. (30). If the
energy does not remain constant, the phase path will be more
complicated, but if E is restricted to lie between two values $E_1$ and $E_2$, the
path will certainly lie in the phase space between the two ellipses
corresponding to $E_1$ and $E_2$ respectively. We shall have occasion to
utilize this situation later. However, for the moment it will be simpler
to confine our attention to a conservative system. It will be noticed
that the phase curve in Fig. 6-1 nowhere crosses itself. No phase path
representing a dynamical system can ever cross itself; such a situation
would violate the fundamental characteristic of a dynamical system
that the precise knowledge of the q's and p's at one instant suffices to
specify the phase for all past or future time. However, it is of interest
to observe that since according to quantum mechanics this precise
knowledge is never available (cf. the indeterminacy principle),
infinitely sharp phase paths have no meaning in quantum theory and
must be replaced by paths of finite width, or phase ribbons, as they
may be called. In the present chapter we shall ignore this and restrict
ourselves to the classical point of view.
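
The incidental remark that the period equals the area of the phase ellipse divided by the energy can be verified directly. The sketch below samples points on the ellipse (31) and measures the enclosed area as a polygon; the function names and the sample values of m, k, and E are illustrative assumptions.

```python
import math


def phase_path(energy, m, k, n_points=2000):
    """Sample (q, p) points on the ellipse p**2/(2m) + k*q**2/2 = E."""
    q_max = math.sqrt(2.0 * energy / k)   # semi-axis along q
    p_max = math.sqrt(2.0 * energy * m)   # semi-axis along p
    return [(q_max * math.cos(2.0 * math.pi * i / n_points),
             p_max * math.sin(2.0 * math.pi * i / n_points))
            for i in range(n_points)]


def enclosed_area(points):
    """Area of the closed polygon through the points (shoelace formula)."""
    total = 0.0
    for (q_a, p_a), (q_b, p_b) in zip(points, points[1:] + points[:1]):
        total += q_a * p_b - q_b * p_a
    return abs(total) / 2.0


# Hypothetical sample values.
m, k, energy = 2.0, 8.0, 3.0

area = enclosed_area(phase_path(energy, m, k))
period = 2.0 * math.pi * math.sqrt(m / k)

print("area / E =", area / energy)
print("period   =", period)
```

The two printed numbers agree to within the small error of replacing the ellipse by an inscribed polygon.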

4. THE GIBBSIAN ENSEMBLE 

We are now prepared to introduce the concept of an ensemble. 
Let us imagine a collection of a great many dynamical systems all 
having the same Hamiltonian function. They might, for example, 
be a collection of simple harmonic oscillators, all with the same mass 
and stiffness. We shall suppose that their phases are not the same but 
are in fact distributed over a wide variety of possible phase values. If 
they are oscillators, their phases at any instant might correspond to 
points selected at random on the phase ellipse in Fig. 6-1. The collec- 
tion in question is termed an ensemble. It must be emphasized that 
such an ensemble is a purely mental construction and has no concrete 
existence. This introduces at once an element of abstractness into 
statistical mechanics not shared by classical particle dynamics. We 
are asked to contemplate at once a very large number of copies of a 
dynamical system distributed somehow over all the phases possible for 
such a system. The copies of the system composing the ensemble will 
be termed the elements of the ensemble. 

Each element in the ensemble is represented by a point in phase
space. For the oscillator ensemble, these points lie on the
ellipse in Fig. 6-1 if the energy of the system is fixed or, if it is not
fixed, occupy the whole family of ellipses corresponding to the allowed
range of E. Consider the latter case. In any given area of the phase
space there will be at a given instant a certain number of phase points
corresponding to elements of the ensemble. This enables us to
introduce the concept of phase density. About any point in phase space
we construct a small region and take the limit of the number of phase
points divided by the size of the region as the latter grows smaller.
Clearly to give meaning to this quantity the number of elements must
be very large and distributed more or less continuously. We shall
refer to the phase density as D.

Consider again our illustration of the oscillator and suppose the
ensemble consists of N elements with energies included between $E_1$ and
$E_2$. The average density of the corresponding points in phase space
will then be

$$\bar{D} = \frac{N}{2\pi\sqrt{m/k}\,(E_2 - E_1)}$$

in the phase region between the two ellipses characterized by $E_2$ and
$E_1$ respectively and zero everywhere else. The actual density may be
constant throughout the region and equal to the average, or it may
vary and be a function of p and q. For example, the density may vary
directly with the energy, i.e., D = KE. Let us investigate this for a
moment. The total number of elements of the ensemble is given in
perfectly general fashion by

$$N = \int D\,d\phi, \tag{32}$$

where $d\phi$ is the element of volume of the phase space, and is given in
general by

$$d\phi = dq_1 \cdots dq_f\,dp_1 \cdots dp_f. \tag{33}$$

The integration is carried out over the whole portion of phase space
occupied by the ensemble. In the present illustration $d\phi = dp\,dq$ and
(32) becomes

$$N = K\iint E\,d\phi,$$

the integration being taken over the whole of the appropriate phase
space, i.e., the area contained between the two ellipses for $E_1$ and $E_2$.
We can write simply

$$d\phi = 2\pi\sqrt{\frac{m}{k}}\,dE,$$

and obtain

$$N = 2\pi K\sqrt{\frac{m}{k}}\int_{E_1}^{E_2} E\,dE = \pi K\sqrt{\frac{m}{k}}\left(E_2^2 - E_1^2\right).$$
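
The closed form for N can be compared with a direct evaluation of the integral (32) over the pq plane. The following is a rough sketch using a midpoint rule on a rectangular grid; the grid size, the function names, and the sample values of K, m, k, $E_1$, and $E_2$ are assumptions made for illustration.

```python
import math


def count_elements_grid(K, m, k, e1, e2, n=400):
    """Midpoint-rule estimate of N = integral of D dp dq with D = K*E,
    taken over the region e1 <= E <= e2 of the oscillator phase plane."""
    q_max = math.sqrt(2.0 * e2 / k)
    p_max = math.sqrt(2.0 * e2 * m)
    dq = 2.0 * q_max / n
    dp = 2.0 * p_max / n
    total = 0.0
    for i in range(n):
        q = -q_max + (i + 0.5) * dq
        for j in range(n):
            p = -p_max + (j + 0.5) * dp
            energy = p * p / (2.0 * m) + k * q * q / 2.0
            if e1 <= energy <= e2:
                total += K * energy
    return total * dq * dp


def count_elements_closed_form(K, m, k, e1, e2):
    """N = pi * K * sqrt(m/k) * (E2**2 - E1**2), as obtained in the text."""
    return math.pi * K * math.sqrt(m / k) * (e2 ** 2 - e1 ** 2)


# Hypothetical sample values.
n_grid = count_elements_grid(1.0, 2.0, 0.5, 1.0, 3.0)
n_exact = count_elements_closed_form(1.0, 2.0, 0.5, 1.0, 3.0)
print("grid estimate:", n_grid)
print("closed form:  ", n_exact)
```

The grid estimate differs from the closed form only through cells cut by the two bounding ellipses, so the agreement improves as n is increased.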

If we speak generally instead of a specific illustration, we may say
that D will be a function of the $p_1 \cdots p_f$, $q_1 \cdots q_f$ as well as the time,
and the number of elements of the ensemble in $d\phi$ will be 6

$$dN = D(p_j, q_j, t)\,d\phi. \tag{34}$$

If D does not depend on the time explicitly the ensemble is termed 
stationary. In general, however, we expect D to change with the time 
so that the phase points occupying at a given instant a certain portion 
of the phase space may be thought of as moving out of that region into 
another one with the passage of time. 

Since the element of volume $d\phi$ is to play such an important role in
statistical mechanics, it is of importance to see how it behaves with
respect to an arbitrary transformation of coordinates in configuration
space. Suppose we transform the phase variables $q_1 \cdots q_f$, $p_1 \cdots p_f$ to
the system $Q_1 \cdots Q_f$, $P_1 \cdots P_f$, where

$$Q_i = Q_i(q_1 \cdots q_f), \qquad P_i = P_i(q_1 \cdots q_f, p_1 \cdots p_f). \tag{35}$$



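From the property of the functional determinant already alluded to in
eq. (64) of Chapter V, we have

$$dP_1 \cdots dP_f\,dQ_1 \cdots dQ_f = \frac{\partial(P_1 \cdots P_f, Q_1 \cdots Q_f)}{\partial(p_1 \cdots p_f, q_1 \cdots q_f)}\,dp_1 \cdots dp_f\,dq_1 \cdots dq_f. \tag{36}$$

The effect of the transformation will be clear from an evaluation of the
Jacobian, which is of the form

$$\frac{\partial(P_1 \cdots Q_f)}{\partial(p_1 \cdots q_f)} =
\begin{vmatrix}
\dfrac{\partial P_1}{\partial p_1} & \cdots & \dfrac{\partial P_f}{\partial p_1} & \dfrac{\partial Q_1}{\partial p_1} & \cdots & \dfrac{\partial Q_f}{\partial p_1} \\
\vdots & & \vdots & \vdots & & \vdots \\
\dfrac{\partial P_1}{\partial p_f} & \cdots & \dfrac{\partial P_f}{\partial p_f} & \dfrac{\partial Q_1}{\partial p_f} & \cdots & \dfrac{\partial Q_f}{\partial p_f} \\
\dfrac{\partial P_1}{\partial q_1} & \cdots & \dfrac{\partial P_f}{\partial q_1} & \dfrac{\partial Q_1}{\partial q_1} & \cdots & \dfrac{\partial Q_f}{\partial q_1} \\
\vdots & & \vdots & \vdots & & \vdots \\
\dfrac{\partial P_1}{\partial q_f} & \cdots & \dfrac{\partial P_f}{\partial q_f} & \dfrac{\partial Q_1}{\partial q_f} & \cdots & \dfrac{\partial Q_f}{\partial q_f}
\end{vmatrix}. \tag{37}$$

But since $\partial Q_j/\partial p_k = 0$ for all j and k from 1 to f the determinant at once
breaks down into the product of two determinants, i.e.,

$$\frac{\partial(P_1 \cdots Q_f)}{\partial(p_1 \cdots q_f)} =
\begin{vmatrix}
\dfrac{\partial P_1}{\partial p_1} & \cdots & \dfrac{\partial P_f}{\partial p_1} \\
\vdots & & \vdots \\
\dfrac{\partial P_1}{\partial p_f} & \cdots & \dfrac{\partial P_f}{\partial p_f}
\end{vmatrix}
\cdot
\begin{vmatrix}
\dfrac{\partial Q_1}{\partial q_1} & \cdots & \dfrac{\partial Q_f}{\partial q_1} \\
\vdots & & \vdots \\
\dfrac{\partial Q_1}{\partial q_f} & \cdots & \dfrac{\partial Q_f}{\partial q_f}
\end{vmatrix}. \tag{38}$$

6 Note that $D(p_j, q_j, t)$ is an abbreviation for $D(p_1 \cdots p_f, q_1 \cdots q_f, t)$ as usual.
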
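Let us denote the transformed Hamiltonian by $K(P_1 \cdots P_f, Q_1 \cdots Q_f)$.
Then from eqs. (12) and (20), since the system is assumed to be
conservative,

$$P_i = \frac{\partial K}{\partial\dot{Q}_i}.$$

Consequently we have

$$\frac{\partial P_i}{\partial p_k} = \frac{\partial^2 K}{\partial p_k\,\partial\dot{Q}_i}, \tag{39}$$

which makes it possible to write

$$\frac{\partial P_i}{\partial p_k} = \frac{\partial}{\partial\dot{Q}_i}\frac{\partial K}{\partial p_k}, \tag{40}$$

since the $\dot{Q}_i$ are linear functions of the p's with coefficients involving
the q's and therefore $\partial\dot{Q}_i/\partial p_k$ is a function of the q's alone. But
$\partial K/\partial p_k = \partial H/\partial p_k = \dot{q}_k$ and hence (40) yields

$$\frac{\partial P_i}{\partial p_k} = \frac{\partial\dot{q}_k}{\partial\dot{Q}_i}. \tag{41}$$

But

$$\dot{q}_k = \sum_{i=1}^{f}\frac{\partial q_k}{\partial Q_i}\dot{Q}_i,$$

whence

$$\frac{\partial P_i}{\partial p_k} = \frac{\partial q_k}{\partial Q_i}. \tag{42}$$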



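The expression on the right of (38) then becomes

$$\begin{vmatrix}
\dfrac{\partial q_1}{\partial Q_1} & \cdots & \dfrac{\partial q_f}{\partial Q_1} \\
\vdots & & \vdots \\
\dfrac{\partial q_1}{\partial Q_f} & \cdots & \dfrac{\partial q_f}{\partial Q_f}
\end{vmatrix}
\cdot
\begin{vmatrix}
\dfrac{\partial Q_1}{\partial q_1} & \cdots & \dfrac{\partial Q_f}{\partial q_1} \\
\vdots & & \vdots \\
\dfrac{\partial Q_1}{\partial q_f} & \cdots & \dfrac{\partial Q_f}{\partial q_f}
\end{vmatrix}.$$

Now, from the general rule for multiplying determinants, if we denote
by $d_{ij}$ the ij term in the product determinant, we have

$$d_{ij} = \sum_{k=1}^{f}\frac{\partial q_k}{\partial Q_i}\frac{\partial Q_j}{\partial q_k} = \frac{\partial Q_j}{\partial Q_i} = \delta_{ji}, \tag{43}$$

where $\delta_{ji}$ is the Kronecker symbol which equals unity for i = j and is
zero otherwise. Hence the product determinant is just the diagonal
determinant

$$\begin{vmatrix}
1 & 0 & \cdots & 0 \\
0 & 1 & \cdots & 0 \\
\vdots & & \ddots & \vdots \\
0 & 0 & \cdots & 1
\end{vmatrix}$$

whose value is unity. Since the functional determinant is unity we
have demonstrated the invariance of the volume element $d\phi$ in phase
space with respect to an arbitrary point transformation.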
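
The invariance can be illustrated on a concrete point transformation. The sketch below takes a single particle moving in a plane and passes from rectangular to plane polar coordinates, with the conjugate momenta $p_r$ and $p_\theta$ obtained from (28); the Jacobian is then estimated by central differences. The choice of example, the function names, and the sample phase point are our own assumptions.

```python
import math


def to_polar_phase(x, y, px, py):
    """Point transformation (x, y) -> (r, theta) together with the
    conjugate momenta p_r and p_theta of a particle moving in a plane."""
    r = math.hypot(x, y)
    theta = math.atan2(y, x)
    p_r = (x * px + y * py) / r
    p_theta = x * py - y * px
    return [r, theta, p_r, p_theta]


def numerical_jacobian(func, point, h=1e-6):
    """Matrix of partial derivatives d(new_i)/d(old_j) by central differences."""
    n = len(point)
    jac = [[0.0] * n for _ in range(n)]
    for col in range(n):
        plus, minus = list(point), list(point)
        plus[col] += h
        minus[col] -= h
        f_plus, f_minus = func(*plus), func(*minus)
        for row in range(n):
            jac[row][col] = (f_plus[row] - f_minus[row]) / (2.0 * h)
    return jac


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1.0
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(a[r][c]))
        if a[pivot][c] == 0.0:
            return 0.0
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            det = -det
        det *= a[c][c]
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            for cc in range(c, n):
                a[r][cc] -= factor * a[c][cc]
    return det


# Hypothetical phase point (x, y, px, py), away from the origin.
jacobian_value = determinant(
    numerical_jacobian(to_polar_phase, [1.2, -0.7, 0.4, 2.5]))
print("Jacobian of the phase transformation:", jacobian_value)
```

For this transformation the coordinate block of the determinant equals 1/r and the momentum block equals r, so their product is unity, in line with the general result.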




5. LIOUVILLE'S THEOREM. THE EQUATION OF CONTINUITY IN
STATISTICAL MECHANICS

Let us now consider a region of phase space and the "flow" of
phase points through it. We can think of this region as enclosed by
a hyper-surface whose "area" element will be denoted by $d\sigma$. We
then generalize the divergence theorem of ordinary three-vectors to
the 2f dimensional phase space. Thus for a generalized vector function
of the phase space, V, with components $V_{q_1}, V_{q_2} \cdots V_{q_f}, V_{p_1} \cdots V_{p_f}$,
we define the divergence of V as

$$\operatorname{div}\mathbf{V} = \sum_{j=1}^{f}\left(\frac{\partial V_{q_j}}{\partial q_j} + \frac{\partial V_{p_j}}{\partial p_j}\right). \tag{44}$$

If we denote by $V_n$ the component of V normal to $d\sigma$ the divergence
theorem becomes

$$\int V_n\,d\sigma = \int\operatorname{div}\mathbf{V}\,d\phi, \tag{45}$$

where the "volume" integral on the right is taken over the whole
phase region under consideration and the "surface" integral on the
left is taken over the hyper-surface enclosing this region. Now apply
this to the case where V is the vector with components $D\dot{q}_1 \cdots D\dot{q}_f$,
$D\dot{p}_1 \cdots D\dot{p}_f$. Thus $D\dot{q}_1$ is the rate of flow of phase points per unit
area in the $q_1$ "direction." The vector V may then be said to represent
"phase flow," and $\int V_n\,d\sigma$ represents the net phase outflow from the
phase region bounded by the hyper-surface over which the integration
is taken. But this phase efflux must be compensated by a change in
phase density inside the volume. In symbols

$$\int V_n\,d\sigma = -\int\frac{\partial D}{\partial t}\,d\phi, \tag{46}$$

whence from (45)

$$\int\left(\frac{\partial D}{\partial t} + \operatorname{div}\mathbf{V}\right)d\phi = 0. \tag{47}$$

Since the phase region over which the integration is extended is
arbitrary, the result in (47) may be written in general

$$\frac{\partial D}{\partial t} + \sum_{j=1}^{f}\left[\frac{\partial(D\dot{q}_j)}{\partial q_j} + \frac{\partial(D\dot{p}_j)}{\partial p_j}\right] = 0. \tag{48}$$

It will be noted at once that this is analogous to the equation of
continuity in hydrodynamics. It can be written in physically more
significant form by carrying out the indicated differentiations. Then

\[ \frac{\partial D}{\partial t} + \sum_{j=1}^{f} \left( \dot q_j \frac{\partial D}{\partial q_j} + \dot p_j \frac{\partial D}{\partial p_j} \right) + D \sum_{j=1}^{f} \left( \frac{\partial \dot q_j}{\partial q_j} + \frac{\partial \dot p_j}{\partial p_j} \right) = 0. \]

But from the canonical equations (19) it follows that for any $j$

\[ \frac{\partial \dot q_j}{\partial q_j} + \frac{\partial \dot p_j}{\partial p_j} = 0. \]

Moreover

\[ \sum_{j=1}^{f} \left( \frac{\partial D}{\partial q_j} \dot q_j + \frac{\partial D}{\partial p_j} \dot p_j \right) \]

is the time rate of change of $D$ due to the changes in the $p$'s and $q$'s.
Consequently (48) becomes

\[ \frac{dD}{dt} = 0, \]

which says that the total time rate of change of $D$, the phase density,
vanishes. This theorem, known as Liouville's theorem, means that to
one following the motion of the phase points in an ensemble the density
does not change with the time. At any given place in phase space, the
density may change, to be sure, but what may be called the motional
change vanishes. Thus if we consider a certain region of phase space
containing at a given instant a certain number of phase points, during
the passage of time these points move in such a way as to occupy an
equal phase volume at every instant, even though the "shape" of the
volume may alter. This result is referred to by Gibbs as the principle
of conservation of density-in-phase. 7

The importance of this theorem for statistical mechanics can 
scarcely be over-emphasized. Not only is it basic for the further 
theoretical development but it has also been used directly to simplify 
physical problems. One of these is the study of the motion of electrons 
in the earth's magnetic field which is of great significance for the 
investigation of cosmic rays. 8 

Gibbs also has shown how Liouville's theorem may be used with 
little or no further mathematical manipulation to prove the invariance 
of the element of phase volume. 9 

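In the special case of a stationary ensemble $\partial D/\partial t = 0$, and the theorem
becomes

\[ \sum_{j=1}^{f} \left( \dot q_j \frac{\partial D}{\partial q_j} + \dot p_j \frac{\partial D}{\partial p_j} \right) = 0. \tag{50} \]

It can be shown that if $D$ is a function of the energy $E$ alone (50)
follows directly. Hence Liouville's theorem will lead to $\partial D/\partial t = 0$, i.e.
the ensemble is stationary. The proof follows. Here

\[ \frac{\partial D}{\partial q_j} = \frac{dD}{dE} \frac{\partial E}{\partial q_j}, \qquad \frac{\partial D}{\partial p_j} = \frac{dD}{dE} \frac{\partial E}{\partial p_j}. \]

But from the canonical equations

\[ \frac{\partial E}{\partial q_j} = \frac{\partial H}{\partial q_j} = -\dot p_j \quad \text{and} \quad \frac{\partial E}{\partial p_j} = \frac{\partial H}{\partial p_j} = \dot q_j. \]

Hence

\[ \sum_{j=1}^{f} \left( \dot q_j \frac{\partial D}{\partial q_j} + \dot p_j \frac{\partial D}{\partial p_j} \right) = \frac{dD}{dE} \sum_{j=1}^{f} \left( -\dot q_j \dot p_j + \dot p_j \dot q_j \right) = 0, \]

and therefore (50) follows. Therefore an ensemble for which the phase
density is a function of the energy alone must be stationary.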
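
The stationarity condition (50) can also be checked numerically. The following is only a sketch under stated assumptions: the system is taken to be a simple harmonic oscillator with unit mass and force constant, the density is an arbitrarily chosen function of the energy (an exponential), and all function names are hypothetical. Derivatives are approximated by central differences, so the residual vanishes only to within the discretization error.

```python
import math


def hamiltonian(q, p, m=1.0, k=1.0):
    """Energy of a simple harmonic oscillator (assumed example system)."""
    return p * p / (2.0 * m) + k * q * q / 2.0


def density(q, p, theta=1.0):
    """A phase density depending on q and p only through the energy.

    The exponential form is an arbitrary illustrative choice.
    """
    return math.exp(-hamiltonian(q, p) / theta)


def partial(func, q, p, wrt, h=1e-5):
    """Central-difference partial derivative of func with respect to q or p."""
    if wrt == "q":
        return (func(q + h, p) - func(q - h, p)) / (2.0 * h)
    return (func(q, p + h) - func(q, p - h)) / (2.0 * h)


def stationarity_residual(q, p):
    """Left-hand side of (50): q_dot * dD/dq + p_dot * dD/dp."""
    q_dot = partial(hamiltonian, q, p, "p")    # canonical equation for q
    p_dot = -partial(hamiltonian, q, p, "q")   # canonical equation for p
    return (q_dot * partial(density, q, p, "q")
            + p_dot * partial(density, q, p, "p"))
```

Evaluating `stationarity_residual` at any phase point should give a value that is zero to within the finite-difference error, as the proof above requires.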

7 For an alternative derivation which is purely analytical in character, cf. Lindsay
and Margenau, op. cit., pp. 221 ff.

8 Cf. G. Lemaitre and M. S. Vallarta, Phys. Rev., 43, 87 (1933).
9 Gibbs, op. cit., p. 11.

A simple illustration of Liouville's theorem is provided by a system
consisting of a single charged particle with charge $e$ moving in a
uniform electric field of intensity $F$. The Hamiltonian function for such
a particle is

\[ H = \frac{p^2}{2m} - eFq, \]

where it is assumed that the field is directed along the positive $q$ axis.
An ensemble representing such a system with constant energy $E_1$
would have elements corresponding to the phase points located on the
parabola $E_1$ in the $pq$ plane (cf. Fig. 6-2). Let us, however, imagine an
ensemble with phase points located in the phase region $\phi_1$ between the
two parabolas corresponding to total energies $E_1$ and $E_2$ respectively and
between the parallels to the $q$ axis defined by the values $p_1$ and $p_2$
respectively. If the phase values of the elements of this ensemble
correspond to $t = 0$, at the expiration of time $t$ the phase points of
the ensemble for the system will no longer lie in $\phi_1$. Rather they will
have moved into the new phase region $\phi_2$ bounded by the momentum
values $p_1'$ and $p_2'$, where

\[ p_1' = p_1 + eFt, \qquad p_2' = p_2 + eFt. \]

FIG. 6-2.

From the canonical equations, $\dot p = eF$. Now the phase "volume"
occupied by the ensemble at $t = 0$ is the area

\[ \phi_1 = \frac{(E_2 - E_1)}{eF} (p_2 - p_1), \]

while the phase "volume" occupied by the ensemble at time $t$ is

\[ \phi_2 = \frac{(E_2 - E_1)}{eF} (p_2' - p_1'). \]

But from the expressions for $p_1'$ and $p_2'$ it follows that $p_2' - p_1' =
p_2 - p_1$ and hence $\phi_2 = \phi_1$. As time passes the same set of phase
points occupy equal "volumes" of phase space. Therefore the density-in-phase
must remain the same, as Liouville's theorem requires. The
reader will note that the shape of the successive phase regions occupied
by the ensemble changes though the "volume" is invariant.
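
The invariance of the phase "volume" in this illustration can be sketched numerically. The code below is a minimal example under assumptions of our own: unit mass, the product $eF$ represented by a single parameter `force`, and hypothetical function names. It traces the boundary of the region between the two parabolas as a polygon, moves every boundary point along its exact trajectory, and compares the areas by the shoelace formula.

```python
def flow(q, p, t, m=1.0, force=1.0):
    """Exact motion under H = p^2/2m - force*q after time t (force = eF)."""
    return q + p * t / m + force * t * t / (2.0 * m), p + force * t


def boundary(e1, e2, p1, p2, n=200, m=1.0, force=1.0):
    """Polygon tracing the region between the parabolas E1 and E2, p1 <= p <= p2."""
    def q_on(energy, p):
        # Solve energy = p^2/2m - force*q for q.
        return (p * p / (2.0 * m) - energy) / force

    ps = [p1 + (p2 - p1) * i / n for i in range(n + 1)]
    along_e2 = [(q_on(e2, p), p) for p in ps]
    back_along_e1 = [(q_on(e1, p), p) for p in reversed(ps)]
    return along_e2 + back_along_e1


def polygon_area(points):
    """Absolute area of a closed polygon by the shoelace formula."""
    total = 0.0
    for i, (x0, y0) in enumerate(points):
        x1, y1 = points[(i + 1) % len(points)]
        total += x0 * y1 - x1 * y0
    return abs(total) / 2.0
```

For instance, `polygon_area(boundary(1.0, 2.0, 0.0, 1.0))` and the area of the same polygon after applying `flow` to each vertex should agree, while the vertex coordinates themselves show the region being sheared.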

6. THE FUNDAMENTAL POSTULATE OF STATISTICAL MECHANICS 

It is now necessary to provide some connection between the
ensemble, which is a purely abstract mental construct, and the dynamical
system whose behavior the ensemble has been created to describe.
We do this by means of a fundamental postulate, the statement of
which, however, demands the introduction of an additional concept.
This is the phase probability. If we have $N$ elements in the ensemble
and the phase density is $D$, the probability of choosing at random an
element whose phase is included in the region $d\phi$ about the phase point
$q_1 \cdots q_f, p_1 \cdots p_f$ is defined to be

\[ P \, d\phi = \frac{D}{N} \, d\phi, \tag{51} \]

where $P(q_j, p_j, t)$ is called the probability coefficient, and we obviously
have

\[ P(p_j, q_j, t) = \frac{D(p_j, q_j, t)}{N}. \tag{52} \]

Evidently from (32)

\[ \int P \, d\phi = \frac{1}{N} \int D \, d\phi = 1. \tag{53} \]

That is, the probability of choosing an element at random from the
whole phase space of the ensemble is unity.

The basic postulate now runs as follows: The probability that at a
given instant, $t$, the physical system being described shall have its
phase included in the region $d\phi$ about the value $q_1 \cdots q_f, p_1 \cdots p_f$ is
the same as the probability $P \, d\phi$ of choosing at random from the
ensemble corresponding to the physical system an element included in
the phase region $d\phi$. It is essential here once again to distinguish
between the actual physical system which assumes its possible phase
values one after another as time passes and the imaginary copies which
compose the ensemble, whose elements correspond to all possible phase
values of the system visualized simultaneously.

The practical value of the basic postulate is that it provides a
definite physical meaning for the average over the ensemble of any
physical quantity characteristic of the system being described. Thus
if $x(q_1 \cdots q_f, p_1 \cdots p_f)$ is a physical quantity which is a function of
the $q$'s and $p$'s of the physical system, we shall define the average of $x$
over the ensemble as

\[ \bar{x} = \int x P \, d\phi, \tag{54} \]

the integration being taken as usual over the whole phase space
occupied by the ensemble. From the fundamental postulate it now follows
that $\bar{x}$ is the value of the quantity $x$ which we should expect on the
average to measure for the system when it is in a state of equilibrium,
viz., the only state in which we can contemplate the act of
measurement. The calculation of average values over the ensemble will then
be the principal task of statistical mechanics, for these averages are
taken as the actual physical values encountered in measurement and
entering into the laws characterizing the experimentally observed
behavior of the system. The essential difference between the average
over an ensemble and the kind of average considered in the statistical
distributions of Chapter IV should be carefully noted. Thus, for
example, the average energy $\bar{E}$ in eq. (30) of Chapter IV refers to the
average energy per particle in an actual physical aggregate of particles
canonically distributed. The average defined by (54) is on the
contrary an average of the quantity in question for the whole physical
system described by the ensemble. This point will be illustrated and
emphasized again in the study of special types of ensembles (cf. the
end of Sec. 9 of this chapter).

7. THE MICROCANONICAL OR ENERGY-SHELL ENSEMBLE 

It is now necessary to consider some special types of ensembles, for
clearly the fundamental postulate of the previous section cannot be
applied unless we have a definite distribution of elements to which to
apply it. Let us consider the ensemble whose elements have energies
lying in the interval from $E_0$ to $E_0 + \Delta E$, for which in other words

\[ E_0 \le H(q_1 \cdots q_f, p_1 \cdots p_f) < E_0 + \Delta E. \tag{55} \]

This interval constitutes in phase space what may be called an energy
shell. We shall assume that the phase points of the ensemble fill this
shell everywhere densely. The ensemble itself is usually called a
microcanonical ensemble, a name whose significance will be understood
better later. As a matter of fact it might more appropriately be
termed an energy-shell ensemble.

Now from Liouville's theorem $dD/dt = 0$; if further we suppose
that the ensemble is stationary, i.e., $\partial D/\partial t = 0$, it follows from (50)
that there is no change in phase density along any phase curve. Hence
along every phase curve the density remains constant both in space
and time. From this we infer that $D$ has the same value throughout
the energy shell. To be sure, in order to reach this conclusion, we have
made a tacit hypothesis, namely that every possible phase curve of the
system fills the shell densely or at any rate that each such phase curve
passes infinitely close to every point of the shell. This is the
celebrated ergodic or quasi-ergodic hypothesis. It has received much
attention in discussions of statistical mechanics. It is not difficult to
show that the strict assumption that the phase curve passes through
every point in the energy shell or on the energy surface leads to logical
contradiction. The strict ergodic hypothesis is therefore usually
replaced by the less stringent quasi-ergodic hypothesis that ultimately
the phase curve passes as close as one requires to every point in the
shell. Applied to the actual physical system the hypothesis effectively
means that the system will ultimately get infinitely close to every
possible phase value consistent with the restriction to constant energy
or limited energy variation. So stated, the assumption has a plausible
ring. The purely mathematical problems which it poses, however, are
considerable. For a consideration of these the reader is referred to
P. S. Epstein. 10

Let $\Delta\phi$ be the phase volume of the energy shell. What we wish to do is to
specialize eq. (54) so as to obtain a convenient expression for the average of any
function of the phase over the microcanonical ensemble. We construct the
diagram shown in Fig. 6-3 where a portion of the energy shell is shown
schematically. Here $d\sigma$ represents a portion of the energy "surface" defined by
$H(p_1 \cdots p_f, q_1 \cdots q_f) = E_0$ and $\Delta s$ is the "normal" to this surface
which extends to the other bounding surface of the phase volume
occupied by the ensemble, viz., $H = E_0 + \Delta E$. Then for the element
of phase volume of the ensemble we may write

\[ d\phi = \Delta s \cdot d\sigma, \tag{56} \]

FIG. 6-3.

where further $\Delta\phi = \int d\phi$ over the space between the energy surfaces.
We now get

\[ P \, d\phi = \frac{D \, d\phi}{\int D \, d\phi} = \frac{d\phi}{\Delta\phi}, \tag{57} \]

since $\int D \, d\phi = N$, where $N$ is the total number of elements in the
ensemble, and we can take $D$ outside of the integral sign because of
its constancy. But now we can write to an approximation which
improves as $\Delta E$ grows smaller

\[ \frac{\Delta E}{\Delta s} = |\nabla H|, \tag{58} \]

where $|\nabla H|$ is the absolute value of the generalized gradient of the
Hamiltonian, viz.,

\[ |\nabla H| = \sqrt{ \sum_{j=1}^{f} \left[ \left( \frac{\partial H}{\partial q_j} \right)^2 + \left( \frac{\partial H}{\partial p_j} \right)^2 \right] }. \tag{59} \]

Its value is of course taken at the energy surface. The volume of
the energy shell depends on $E_0$ and $\Delta E$. In fact we can write

\[ \Delta\phi = \omega(E_0) \, \Delta E. \tag{60} \]

As a result of Eqs. (56-60)

\[ P \, d\phi = \frac{d\sigma}{\omega(E_0) \, |\nabla H|}. \tag{61} \]

This enables us to express the average value of any phase function
$x(q_1 \cdots q_f, p_1 \cdots p_f)$ over the microcanonical ensemble in terms of
an integral taken over the inner surface of the energy shell. Thus
from (54)

\[ (\bar{x})_{mc} = \frac{1}{\omega(E_0)} \int \frac{x \, d\sigma}{|\nabla H|}. \tag{62} \]

10 "Commentary on the Scientific Writings of J. Willard Gibbs," pp. 465 ff.
Yale University Press, 1936.

Before going further with the general treatment, let us specialize
the above to the case where the dynamical system is a simple harmonic
oscillator. This enables us to visualize a little better what we are
doing. In Fig. 6-4 we represent the energy shell for the ensemble as the
plane area between the two ellipses whose equations are
$H = p^2/2m + kq^2/2 = E_0$ and $p^2/2m + kq^2/2 = E_0 + \Delta E$.
The element of "surface" $d\sigma$ of the shell now becomes the
element of arc of the ellipse, viz.,

\[ d\sigma = \sqrt{dp^2 + dq^2} = \sqrt{1 + \left( \frac{dp}{dq} \right)^2} \, dq. \tag{63} \]

FIG. 6-4.

Moreover

\[ |\nabla H| = \sqrt{ \frac{p^2}{m^2} + k^2 q^2 }. \tag{64} \]

$\omega(E_0)$ is obtained from the dependence of $\phi$ (here area) on the energy.
Thus since the area of the ellipse for $E_0$ is $2\pi \sqrt{m/k} \, E_0$ the size of the
shell is given by

\[ \Delta\phi = 2\pi \sqrt{\frac{m}{k}} \, \Delta E, \tag{65} \]

whence $\omega(E_0) = 2\pi \sqrt{m/k}$ = constant. We are now prepared to write
$P \, d\phi$. Thus, since $p = \sqrt{2mE_0 - mkq^2}$ on the ellipse,

\[ \frac{dp}{dq} = -\frac{mkq}{\sqrt{2mE_0 - mkq^2}} \]

and

\[ \frac{d\sigma}{|\nabla H|} = \frac{m \, dq}{\sqrt{2mE_0 - mkq^2}}. \]

Hence

\[ P \, d\phi = \frac{1}{2\pi} \sqrt{\frac{k}{m}} \, \frac{m \, dq}{\sqrt{2mE_0 - mkq^2}}. \tag{66} \]

It is interesting to observe that $P \, d\phi$ reduces precisely to $dt/\tau_0$ where $\tau_0$
is the period of the oscillator and $dt$ is the time spent in the interval $dq$.

As a check on (66) we can show by direct integration that $\int P \, d\phi$
equals unity when the integration is extended over the whole shell.
The evaluation of the microcanonical average kinetic and potential
energies is of interest. Thus

\[ \left( \overline{\frac{p^2}{2m}} \right)_{mc} = \frac{1}{2\pi} \sqrt{\frac{k}{m}} \oint \frac{p^2}{2m} \, \frac{m \, dq}{\sqrt{2mE_0 - mkq^2}} = \frac{E_0}{2}. \tag{67} \]

In similar fashion

\[ \left( \overline{\frac{kq^2}{2}} \right)_{mc} = \frac{E_0}{2}. \tag{68} \]

Thus for the simple harmonic oscillator the averages of the kinetic and
potential energies over a microcanonical ensemble are each equal to
one-half the characteristic energy of the ensemble. This reminds us of
the ordinary mechanical theorem that the time average of the kinetic
energy of a simple harmonic oscillator is equal to the time average of
the potential energy. The connection here is indeed an illustration of
the basic postulate discussed in Sec. 6.
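
The results (67) and (68) can be sketched numerically by using the observation that the microcanonical weight of the oscillator equals the fraction of the period spent in each interval. The code below is an illustrative sketch only: the function name and default parameters are assumptions, and the average over the energy surface is replaced by an average over equally spaced phases of the motion.

```python
import math


def microcanonical_averages(energy, m=1.0, k=1.0, samples=1000):
    """Average kinetic and potential energy over one period of the oscillator.

    Equal steps in the phase of the motion are equal steps in time, so the
    sums weight each point of the energy curve by the time spent near it.
    """
    amplitude = math.sqrt(2.0 * energy / k)
    angular_frequency = math.sqrt(k / m)
    kinetic = 0.0
    potential = 0.0
    for i in range(samples):
        phase = 2.0 * math.pi * (i + 0.5) / samples
        q = amplitude * math.cos(phase)
        p = -m * angular_frequency * amplitude * math.sin(phase)
        kinetic += p * p / (2.0 * m)
        potential += k * q * q / 2.0
    return kinetic / samples, potential / samples
```

For any choice of mass and force constant both returned values should come out equal to one-half of `energy`, in agreement with (67) and (68).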

It will be observed that in the above example we can write $p^2/2m$ as
$\frac{p}{2} \frac{\partial H}{\partial p}$ and $kq^2/2$ as $\frac{q}{2} \frac{\partial H}{\partial q}$. This suggests a possible generalization as
follows: Let us form the average of the quantity

\[ x = q_k \frac{\partial H}{\partial q_k}. \]

Then

\[ (\bar{x})_{mc} = \frac{1}{\omega(E_0)} \int q_k \frac{\partial H}{\partial q_k} \, \frac{d\sigma}{|\nabla H|}, \tag{69} \]

where $\omega(E_0)$ has been removed from under the integral sign since it has
the same value at all points of the energy surface.
Now

\[ \frac{\partial H / \partial q_k}{|\nabla H|} \]

may be interpreted as the cosine of the "angle" between the normal
to the energy surface and the $q_k$ "direction." This follows from the
fact that in analogy to the ordinary theory of surfaces
$\frac{\partial H}{\partial q_1} \cdots \frac{\partial H}{\partial q_f}, \frac{\partial H}{\partial p_1} \cdots \frac{\partial H}{\partial p_f}$
are proportional to the "direction cosines" of the normal
to the surface

\[ H(p_1 \cdots p_f, q_1 \cdots q_f) = E_0. \]

Consequently we may look upon $\frac{\partial H}{\partial q_k} \frac{d\sigma}{|\nabla H|}$ as the projection of $d\sigma$ on
the $2f - 1$ dimensional "plane" normal to the $q_k$ direction. Therefore

\[ q_k \frac{\partial H}{\partial q_k} \, \frac{d\sigma}{|\nabla H|} \]

may be interpreted as the "volume element" $d\phi^*$ of the $2f$ dimensional
phase volume enclosed by the energy surface. The same will hold true
of

\[ p_k \frac{\partial H}{\partial p_k} \, \frac{d\sigma}{|\nabla H|}. \]

Hence for both we shall have

\[ \left( \overline{q_k \frac{\partial H}{\partial q_k}} \right)_{mc} = \left( \overline{p_k \frac{\partial H}{\partial p_k}} \right)_{mc} = \frac{\phi^*}{\omega(E_0)}. \tag{70} \]

Here $\phi^*$ is the total phase volume enclosed by the energy surface.
We can make an immediate interesting application. For
$\frac{\partial H}{\partial p_k} = \dot q_k$
from the canonical equations (19), and $\sum_k p_k \dot q_k = H + L = 2T$, where
$T$ is the total kinetic energy. Hence we may refer to $p_k \dot q_k$ as $2T_k$, or
twice the kinetic energy associated with the $k$th degree of freedom.
From (70) we therefore have

\[ (\overline{2T_k})_{mc} = \frac{\phi^*}{\omega(E_0)}. \tag{71} \]

But since the right-hand side is constant, this implies the equipartition
of kinetic energy among the degrees of freedom of the system.

Consider further $q_k \frac{\partial H}{\partial q_k} = -q_k \dot p_k$ (from 19). If we treat $\dot p_k$ as the
generalized force associated with the $k$th degree of freedom, eq. (6) of
Chapter V suggests that we may define

\[ -\frac{1}{2} \left( \overline{\sum_k q_k \dot p_k} \right)_{mc} \tag{72} \]

as the generalized virial of the dynamical system under consideration,
since it is formed in precisely the same way as the virial of Clausius for
an actual physical system of particles. But from the theorem (70), it
follows that

\[ -\frac{1}{2} \left( \overline{\sum_k q_k \dot p_k} \right)_{mc} = (\bar{T})_{mc}, \tag{73} \]

or the generalized virial is equal to the microcanonical average of the
kinetic energy of the system. This is the statistical mechanical
version of the virial theorem.

The reader will find it instructive to verify this result for the simple 
harmonic oscillator. 
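
One way to carry out the suggested verification is numerically. The sketch below assumes a one-dimensional oscillator, takes the generalized force as $\dot p = -kq$ from the canonical equations, and again replaces the microcanonical average by an average over equally spaced phases of the motion; the function name and defaults are hypothetical.

```python
import math


def virial_and_kinetic(energy, m=1.0, k=1.0, samples=1000):
    """Return the generalized virial and the mean kinetic energy of the oscillator."""
    amplitude = math.sqrt(2.0 * energy / k)
    angular_frequency = math.sqrt(k / m)
    virial = 0.0
    kinetic = 0.0
    for i in range(samples):
        phase = 2.0 * math.pi * (i + 0.5) / samples
        q = amplitude * math.cos(phase)
        p = -m * angular_frequency * amplitude * math.sin(phase)
        p_dot = -k * q                     # canonical equation: p_dot = -dH/dq
        virial += -0.5 * q * p_dot         # summand of the generalized virial
        kinetic += p * p / (2.0 * m)
    return virial / samples, kinetic / samples
```

The two returned numbers should agree with each other, and each should equal one-half of `energy`, which is the oscillator form of the virial theorem stated above.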

8. COMPONENT SYSTEMS 

The physical system represented by the ensemble has so far been
thought of as a unitary whole. We now wish to resolve it into a number
of component systems assumed to be independent of each other,
save for the possibility of energy exchange. Thus if the original system
is the aggregate of all the molecules composing a given mass of an ideal
gas, a component system might be an individual molecule. Denote the
resultant system by $S$ and its $k$ components by $S_1, S_2, \cdots, S_k$. If $S_j$
has $f_j$ degrees of freedom and $f$ is the total number of degrees of freedom
of $S$

\[ \sum_{j=1}^{k} f_j = f. \tag{74} \]

It should be clear that the same phase space cannot serve for the
component and resultant systems. For the resultant system we shall
employ what is called a $\gamma$ space, for the component systems the spaces
will be denoted by $\mu_1, \mu_2, \cdots, \mu_k$. The phase space $\mu_j$ has $2f_j$
dimensions. If we are dealing with molecules in the form of particles with
three degrees of freedom, each $\mu$ space has six dimensions, and if
there are $N$ molecules in the resultant system, the $\gamma$ space has $6N$
dimensions.

Though we shall conceive the system $S$ to be closed and hence
representable by a microcanonical ensemble, it would restrict matters
too much to assume the same for the components, since we should like
to think of the components as having the possibility of changing their
energy by transfer from one to another without altering the energy of
the whole. For this reason we shall not be able to construct
microcanonical ensembles for the component systems, and the phase points
for $S_k$ may occupy any part of $\mu_k$. But the phase probability $P$ for
the resultant system will presumably somehow depend on the
instantaneous energy of the system and we shall in fact assume that $P$ is a
function of $E$ alone. For each component system there will also be
a phase probability $P_j = P_j(E_j)$ where $E_j$ is the (variable) energy of
the $j$th system and $\sum_j E_j = E_0$. If the ensembles representing $S$ and $S_j$
are stationary, $P$ and $P_j$ will be independent of the time. The problem
now is to find the functions $P_j(E_j)$. The probability that the phase
points for system $S_1$ lie in $d\phi_1$ of space $\mu_1$, those for $S_j$ in $d\phi_j$ of
space $\mu_j$, etc. will be by definition

\[ P_1 \, d\phi_1 \cdot P_2 \, d\phi_2 \cdots P_k \, d\phi_k. \]

But this must equal the probability that the ensemble for $S$ shall lie in
$d\phi$, since the systems have been assumed to be independent. Hence

\[ P_1 P_2 \cdots P_k \, d\phi_1 \cdots d\phi_k = P \, d\phi. \tag{75} \]

Since $d\phi = d\phi_1 \cdot d\phi_2 \cdots d\phi_k$ we get the functional equation

\[ P_1(E_1) \cdots P_k(E_k) = P(E_1 + E_2 + \cdots + E_k). \tag{76} \]

We can write (76) in the form

\[ \sum_{i=1}^{k} \log P_i = \log P, \tag{77} \]

and if we alter $E_i$ by $dE_i$ subject to the condition that $d \sum_i E_i = dE_0 = 0$,
the result is

\[ \sum_{i=1}^{k} d \log P_i = d \log P, \quad \text{i.e.,} \quad \sum_{i=1}^{k} \frac{1}{P_i} \frac{dP_i}{dE_i} \, dE_i = \frac{1}{P} \sum_{i=1}^{k} \frac{\partial P}{\partial E_i} \, dE_i. \tag{78} \]

By equating corresponding terms and recalling that $\frac{\partial P}{\partial E_i}$ is the same
for all $i$ since $P$ is a function of the total energy only, we arrive at

\[ \frac{1}{P_1} \frac{dP_1}{dE_1} = \frac{1}{P_2} \frac{dP_2}{dE_2} = \cdots = \frac{1}{P_k} \frac{dP_k}{dE_k} = C, \tag{79} \]

where $C$ is a negative constant and will be written in the form $-1/\Theta$.
Therefore for any $j$ from 1 to $k$ the dependence of $P_j$ on $E_j$ takes the
form

\[ P_j = C_j e^{-E_j/\Theta}. \tag{80} \]

The $C_j$ are multiplicative parameters to be evaluated by further
conditions placed on the $P_j$.
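
That the exponential form (80) does satisfy the functional equation (76) can be illustrated with a few lines of code. This is only a sketch with hypothetical function names; it assumes that the resultant probability carries the product of the $C_j$ as its own coefficient, which is what (76) requires when every component has the form (80).

```python
import math


def component_probability(energy, coefficient, theta):
    """P_j = C_j * exp(-E_j / theta), the form (80)."""
    return coefficient * math.exp(-energy / theta)


def resultant_probability(energies, coefficients, theta):
    """P as a function of the total energy only, with coefficient prod(C_j)."""
    total_energy = sum(energies)
    return math.prod(coefficients) * math.exp(-total_energy / theta)
```

Multiplying `component_probability` over all components should reproduce `resultant_probability` for any partition of the same total energy, which is the content of (76).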

9. THE CANONICAL ENSEMBLE 

The immediately preceding considerations at once suggest a new
type of ensemble in which the energy is not restricted to lie in a shell
but is allowed to vary continuously, and in which the fractional number
of elements of the ensemble per unit volume of phase space is not
constant but varies with the energy. Thus following Gibbs we shall
introduce an ensemble such that

\[ P \, d\phi = C e^{-H(q_1 \cdots q_f, \, p_1 \cdots p_f)/\Theta} \, d\phi, \tag{81} \]

with the probability coefficient

\[ P = e^{(\psi - H)/\Theta}. \tag{82} \]

We have here placed $C = e^{\psi/\Theta}$ by the introduction of the new
parameter $\psi$. The meaning of (81) and (82) is this: For a given system
the Hamiltonian function is a certain function of the $q$'s and $p$'s.
The probability of picking at random an element of the ensemble lying
in the phase element $d\phi$ about the point $(q_1 \cdots q_f, p_1 \cdots p_f)$ depends
on the position of this point through the Hamiltonian. Consider as an
example the simple harmonic oscillator. Here

\[ P \, dp \, dq = e^{\psi/\Theta} e^{-(p^2/2m + kq^2/2)/\Theta} \, dp \, dq. \]

For small absolute values of $p$ and $q$, i.e., small total energy of the
oscillator, $P$ will be larger than for large values of $p$ and $q$, which
correspond to larger total energy.

In following Gibbs we shall call an ensemble for which the
probability coefficient has the form (82) a canonical ensemble. Its relation
to the canonical distribution discussed in Chapter IV will be discussed
in the sequel. $\Theta$ is defined as the modulus of the ensemble. The second
parameter $\psi$ can be expressed in terms of $\Theta$ by the relation (from 53)

\[ e^{-\psi/\Theta} = \int e^{-H/\Theta} \, d\phi, \tag{83} \]

the integral being taken over all the portion of phase space occupied
by the ensemble. There would appear to be a certain difficulty about
this equation. Inspection shows that it is not dimensionally correct
since on the left we have $e^{-\psi/\Theta}$, a pure number, while $d\phi$ on the right
has the dimensions of phase space. We can correct this situation
merely by thinking of $d\phi$ as a pure number giving the ratio of the
genuine volume element in phase space to an arbitrarily chosen unit
volume. This will still enable us to compute ensemble-averages by
means of eq. (54).

In the canonical ensemble we have a means of representing
statistically a physical system of variable energy, i.e., an unclosed system like
the component systems $S_j$ mentioned in Sec. 8. We can then look
upon the aggregate of such systems $S_j$, each represented by a
canonical ensemble, as a system of constant energy $E_0$ represented by a
microcanonical ensemble. From this point of view a canonical ensemble
may be interpreted as a component part of a microcanonical ensemble.
There is, however, another possible point of view which considers a
microcanonical ensemble as a part of a canonical ensemble. Suppose
all the component systems $S_j$ are dynamically similar with the same
number of degrees of freedom, etc., e.g., similar molecules. Then they
may all be represented in a single phase space which we may call the
$\mu$ space. We may now divide the $\mu$ space into a set of energy shells
within each of which the phase density is constant but with $P$ varying
from shell to shell according to (82). Here the microcanonical
ensemble for each system $S_j$ appears naturally as a component part of a
canonical ensemble for the same system. To a certain extent this
justifies the name associated with it.

Let us now find the probability that the element of a canonical
ensemble shall correspond to energy lying in the interval from $E_0$ to
$E_0 + \Delta E$. By the fundamental postulate of Sec. 6, this is equal to the
probability that the dynamical system represented by the canonical
ensemble shall be found in such a phase interval that its energy lies
between $E_0$ and $E_0 + \Delta E$. But this probability is simply

\[ e^{(\psi - E_0)/\Theta} \, \Delta\phi, \tag{84} \]

where $\Delta\phi$ is the elementary volume in phase space corresponding to the
energy shell in question.

Rather more important is the calculation of the average value of any
dynamical quantity over a canonical ensemble. Thus if $x(q_1 \cdots q_f,
p_1 \cdots p_f)$ is such a quantity, we have from (53), (54), and (82)

\[ (\bar{x})_c = \int x e^{(\psi - H)/\Theta} \, d\phi = \frac{\int x e^{-H/\Theta} \, d\phi}{\int e^{-H/\Theta} \, d\phi}. \tag{85} \]

Since (85) eliminates the parameter $\psi$ it is often the more convenient
form. The integrations in both numerator and denominator are taken
over the whole of the phase space occupied by the ensemble.

As an illustration consider the average energy of an oscillator.
It will be simplest to express the integrations in terms of the energy
itself by using the energy shell as the elementary phase volume and
writing $d\phi = 2\pi \sqrt{m/k} \, dE$. Then

\[ (\bar{E})_c = \frac{\int_0^\infty E e^{-E/\Theta} \, dE}{\int_0^\infty e^{-E/\Theta} \, dE}. \tag{86} \]

For the sake of simplicity we use the limits 0 and $\infty$, though strictly
speaking the upper limit should be finite. Because of the exponential,
however, the error due to this change will be slight. The result of the
integration turns out to be

\[ (\bar{E})_c = \Theta. \tag{87} \]
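
The ratio of integrals in (86) is easy to evaluate numerically, which also shows how little a finite upper limit matters. The following sketch uses a simple midpoint rule; the function name, the cutoff of forty times the modulus, and the number of steps are all arbitrary illustrative choices rather than anything fixed by the theory.

```python
import math


def canonical_mean_energy(theta, e_max_factor=40.0, steps=20000):
    """Midpoint-rule estimate of the ratio of integrals in (86).

    The upper limit is the finite value e_max_factor * theta instead of infinity.
    """
    e_max = e_max_factor * theta
    width = e_max / steps
    numerator = 0.0
    denominator = 0.0
    for i in range(steps):
        energy = (i + 0.5) * width
        weight = math.exp(-energy / theta)
        numerator += energy * weight
        denominator += weight
    # The common factor `width` cancels in the ratio.
    return numerator / denominator
```

Calling `canonical_mean_energy(theta)` for a few positive values of `theta` should return a number very close to `theta` itself, in agreement with (87).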

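Incidentally the reader may show that the same result is obtained
from the direct expression, using the Hamiltonian of the oscillator,

\[ (\bar{E})_c = \frac{\int\!\!\int_{-\infty}^{+\infty} (p^2/2m + kq^2/2) \, e^{-(p^2/2m + kq^2/2)/\Theta} \, dp \, dq}{\int\!\!\int_{-\infty}^{+\infty} e^{-(p^2/2m + kq^2/2)/\Theta} \, dp \, dq}. \]
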

In attempting to generalize the preceding results we proceed to find
the canonical average of $\phi^*/\omega$, where $\phi^*$ is the total phase volume
enclosed by the energy surface $H = E$ and $\omega(E) = d\phi^*/dE$ (cf. eq. 60).
Thus

\[ \left( \overline{\frac{\phi^*}{\omega}} \right)_c = \frac{\int_\Phi \frac{\phi^*}{\omega} \, e^{-E/\Theta} \, d\phi}{\int_\Phi e^{-E/\Theta} \, d\phi}, \tag{88} \]

where $\Phi$ is the total phase volume over which we are taking the
average. If we now replace $d\phi$ by $\omega(E) \, dE$ in the numerator, the
expression (88) becomes

\[ \frac{\int_0^{E_{max}} \phi^* e^{-E/\Theta} \, dE}{\int_\Phi e^{-E/\Theta} \, d\phi}. \tag{89} \]

A partial integration of the integral in the numerator yields

\[ \int_0^{E_{max}} \phi^* e^{-E/\Theta} \, dE = \left[ -\Theta \phi^* e^{-E/\Theta} \right]_0^{E_{max}} + \Theta \int e^{-E/\Theta} \, d\phi^*. \tag{90} \]

If now we take $E_{max}$ large enough the integrated part will vanish since
the term $e^{-E/\Theta}$ will wipe out $\phi^*$, as long as $\phi^*$ depends algebraically
on $E$, which will be true in practice. Moreover in the integral in the
denominator we may replace $d\phi$ by $d\phi^*$ merely by a transformation of
coordinates, which results, for integration over a sufficiently large
volume of phase space, in

\[ \left( \overline{\frac{\phi^*}{\omega}} \right)_c = \Theta. \tag{91} \]

This provides an interpretation of the modulus of the canonical
ensemble. Now in eq. (71) we have already shown that $\phi^*/\omega$ is itself
the average of $2T_j$ (or twice the kinetic energy associated with the $j$th
degree of freedom of the system) taken over a microcanonical ensemble
with elements filling the energy shell $E_0$, $E_0 + \Delta E$ enclosing the phase
volume $\phi^*$. What we have shown in (91) then reduces to this: the
canonical average of the microcanonical average kinetic energy
corresponding to each degree of freedom of the system is equal to $\Theta/2$, i.e.,

\[ \left( (\bar{T}_j)_{mc} \right)_c = \frac{\Theta}{2} = \bar{T}_j. \tag{92} \]

We use the single bar to denote the generalized average of $T_j$.
Equation (92) signifies equipartition of kinetic energy among the degrees of
freedom of the dynamical system. The generalized average of the
total kinetic energy becomes $f\Theta/2$, if there are $f$ degrees of freedom.
Our present procedure amounts to thinking of the canonical ensemble
as consisting of a set of microcanonical ensembles. The generalized
average then naturally is the canonical average of the microcanonical
average.

It is important at this stage to compare the meaning of the canonical
average of a physical quantity, like the energy, over the ensemble 
representing the system, with the averages encountered in the simple 
statistical considerations of Chapters II and IV. In the so-called 
canonical distribution of Chapter IV, we found expressions for the 
average energy, e.g., eq. (30) of that chapter. This represents the 
average energy per particle of a dynamical system of particles when 
canonically distributed. For example, in a system of free particles, as
in Sec. 5 of Chapter IV we found the average energy to be $\bar{E} = 3\Theta/2$
(eq. 86 of Chapter IV), where $\Theta$ is the modulus of the canonical
distribution. Now it must be emphasized that the average energy over a
canonical ensemble means something quite different from this. The 
average of a quantity over an ensemble refers, as has already been 
stated in Sec. 6 of this chapter, to the expected measured value of the 
quantity for the entire dynamical system in a state of equilibrium. 
Thus, in so far as a canonical ensemble is a suitable statistical repre- 
sentation of a system in equilibrium, the canonical average of the 
energy is the expected value of the energy of the system when we 
measure it in a state of equilibrium; it is not the average energy per
component particle of the system. It is essential that this be realized 
in order that a correct understanding be had of the relation between 
the Gibbs ensemble method and the Maxwell-Boltzmann method of 
statistical distribution. In the latter method we deal with systems in 
which the total energy is constant and given and we are interested in 
finding out how the particles of the system are distributed with respect 
to their energy. The canonical distribution is that corresponding to 
maximum probability of realization subject to the constancy of the 
total number of particles and the total energy. We hoped to find and 
did indeed find relations among the various quantities defining the 
canonical distribution which correspond to known thermodynamical 
laws. In the Gibbs statistical mechanical method the energy of the 
system is no longer looked upon as an absolutely constant quantity. 
It may indeed fluctuate, but only about an average which it is the 
task of the theory to calculate. This average should be the
experimentally measured value. It is also the further task of the theory to
develop relations among quantities defined by the ensemble which will
correspond to the laws of thermodynamics. The virial theorem and 
the equipartition of energy are cases in point. Further illustrations 
will be developed in what follows. 

In his development of statistical mechanics Gibbs confined himself 
largely to the use of the canonical ensemble, and we shall follow this 
practice in the remainder of this chapter. As the reader has already 
observed from Sec. 8 the canonical ensemble arises naturally as a 
suitable representation of a component system whose energy may 
change by exchange with another component of the same macroscopic 
system, it being understood that the energy of the latter remains 
constant. The canonical ensemble is therefore well fitted to represent 
a system whose energy varies. However, it will later become clear 
(cf. Sec. 11) that it is also able to provide a very satisfactory statistical 
representation of a system whose energy is constant. The canonical 
ensemble is subject to easier analytical manipulation than the micro- 
canonical ensemble. If it can accomplish the same descriptive ends as 
the microcanonical ensemble its use will lead to considerable economy 
of thought. We may make one further remark at this point. We shall 
later associate the physical concept of temperature with the modulus 
of the canonical ensemble. This type of ensemble therefore repre- 
sents a system at constant temperature. 

10. THE MAXWELL-BOLTZMANN DISTRIBUTION LAW 

One of the searching tests of any statistical theory of dynamical 
systems is its ability to lead to the proper law for the distribution of 
velocities. We have now to examine this problem from the standpoint 
of Gibbs' statistical mechanics. We must first construct a canonical 
ensemble for a system of n free particles all of mass m having /( = 3n) 
degrees of freedom. The Hamiltonian for such a system is 



+ & + fo /2m - (93) 

y-i 

The phase probability in the corresponding canonical ensemble is now
(writing $p_j^2 = p_{xj}^2 + p_{yj}^2 + p_{zj}^2$ for short)

$$P\,d\phi = \frac{e^{-\sum_j p_j^2/2m\Theta}\,\prod_{j=1}^{n} dx_j\,dy_j\,dz_j\,dp_{xj}\,dp_{yj}\,dp_{zj}}{\int\!\cdots(2f)\cdots\!\int e^{-\sum_j p_j^2/2m\Theta}\,\prod_{j=1}^{n} dx_j\,dy_j\,dz_j \prod_{j=1}^{n} dp_{xj}\,dp_{yj}\,dp_{zj}}, \tag{94}$$

where the integral in the denominator is $2f$-fold. This, of course,
gives the probability associated with an ensemble element of volume
$d\phi$ in the neighborhood of the phase $x_1, y_1, z_1, p_{x1}, p_{y1},
p_{z1}, \ldots, x_n, y_n, z_n, p_{xn}, p_{yn}, p_{zn}$. From the
fundamental postulate of Sec. 6, this is the probability of finding the
actual system of $f$ degrees of freedom in the state for which the first
particle has coordinates in the neighborhood of $x_1, y_1, z_1$ and
momentum components in the neighborhood of $p_{x1}, p_{y1}, p_{z1}$, and
correspondingly for the other particles.

Now we actually wish the expression for the probability that any
one particle (for simplicity we shall take the one denoted by
subscript 1) shall have its phase as indicated while the other particles
are distributed in any fashion whatever consistent with the given total
energy and finite volume. From the fundamental postulate and eq. (94)
this will be

$$P_1\,d\phi_1 = \frac{d\phi_1\,e^{-p_1^2/2m\Theta}\int\!\cdots(2f-6)\cdots\!\int e^{-\sum_{j=2}^{n} p_j^2/2m\Theta}\prod_{j=2}^{n} dx_j\,dy_j\,dz_j\,dp_{xj}\,dp_{yj}\,dp_{zj}}{A}, \tag{95}$$

where $A$ is the same denominator as in (94) and $d\phi_1 = dx_1\,dy_1\,dz_1\,
dp_{x1}\,dp_{y1}\,dp_{z1}$. Note that the integral in the numerator is now
$(2f - 6)$-fold. If further we wish the probability that the first
particle shall have its momentum components in the neighborhood of
$p_{x1}, p_{y1}, p_{z1}$ while its coordinates are unrestricted we have
further



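$$P_1'\,d\phi_1' = \frac{e^{-p_1^2/2m\Theta}\,dp_{x1}\,dp_{y1}\,dp_{z1}\int\!\cdots(f-3)\cdots\!\int e^{-\sum_{j=2}^{n} p_j^2/2m\Theta}\prod_{j=2}^{n} dp_{xj}\,dp_{yj}\,dp_{zj}}{\int\!\cdots(f)\cdots\!\int e^{-\sum_{j=1}^{n} p_j^2/2m\Theta}\prod_{j=1}^{n} dp_{xj}\,dp_{yj}\,dp_{zj}}. \tag{96}$$

The integrals in (96) are easy to carry out by iteration of the
well-known integral
$\iiint e^{-(p_x^2 + p_y^2 + p_z^2)/2m\Theta}\,dp_x\,dp_y\,dp_z = (2\pi m\Theta)^{3/2}$.
The result is that

$$P_1'\,d\phi_1' = (2\pi m\Theta)^{-3/2}\,e^{-(p_{x1}^2 + p_{y1}^2 + p_{z1}^2)/2m\Theta}\,dp_{x1}\,dp_{y1}\,dp_{z1}. \tag{97}$$

In place of the first particle we might have taken any particle. We
therefore reach the conclusion that the average fractional number of
particles having their momentum components in the interval $p_x$,
$p_x + dp_x$, etc., becomes

$$\frac{dn}{n} = (2\pi m\Theta)^{-3/2}\,e^{-(p_x^2 + p_y^2 + p_z^2)/2m\Theta}\,dp_x\,dp_y\,dp_z. \tag{98}$$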

If further we replace $p_x$ by $mv_x$, etc., this would take the precise
form of the Maxwellian velocity distribution (50) in Chapter V, provided
indeed we are entitled to replace the characteristic parameter $\Theta$
of the ensemble by $kT$. The temptation is therefore strong to look upon
(97) and its equivalent (98) as the statistical mechanical version of the
Maxwell-Boltzmann distribution law and to consider the parameter $\Theta$
as the statistical mechanical analogue of something proportional to the
thermodynamic temperature.
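
The content of (98) can be illustrated numerically. The following Python
sketch is ours and not part of the original development; the function
names and the numerical values of $m$ and $\Theta$ are arbitrary
assumptions. It draws momenta from the distribution (98), for which each
component is Gaussian with variance $m\Theta$, and checks that the mean
kinetic energy per particle is close to $3\Theta/2$. It also checks the
one-dimensional form of the "well-known integral" used above.

```python
import math
import random


def sample_momentum(m, theta, rng):
    """Draw (p_x, p_y, p_z) from the density of eq. (98).

    Each component is Gaussian with zero mean and variance m*theta.
    """
    sigma = math.sqrt(m * theta)
    return tuple(rng.gauss(0.0, sigma) for _ in range(3))


def mean_kinetic_energy(m, theta, n_samples=100_000, seed=0):
    """Monte Carlo estimate of the average of p^2 / (2m) under eq. (98)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        px, py, pz = sample_momentum(m, theta, rng)
        total += (px * px + py * py + pz * pz) / (2.0 * m)
    return total / n_samples


def gaussian_integral_1d(m, theta, half_width=12.0, steps=20_000):
    """Midpoint-rule estimate of the integral of exp(-p^2 / (2 m theta)) dp."""
    sigma = math.sqrt(m * theta)
    start = -half_width * sigma
    dp = 2.0 * half_width * sigma / steps
    return sum(
        math.exp(-((start + (i + 0.5) * dp) ** 2) / (2.0 * m * theta))
        for i in range(steps)
    ) * dp


if __name__ == "__main__":
    m, theta = 2.0, 1.5  # arbitrary illustrative values
    print("mean kinetic energy:", mean_kinetic_energy(m, theta))
    print("3*theta/2          :", 1.5 * theta)
    print("1-D integral       :", gaussian_integral_1d(m, theta))
    print("sqrt(2*pi*m*theta) :", math.sqrt(2.0 * math.pi * m * theta))
```

The cube of the one-dimensional integral reproduces the factor
$(2\pi m\Theta)^{3/2}$ that normalizes (97) and (98).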

It is worth while pointing out that the restriction to absolutely free
particles is not necessary. We can introduce a potential energy
function $V(x_j, y_j, z_j)$ into the Hamiltonian. As long as it is a
function of the coordinates only, its influence will cancel out in the
integration over the coordinates and the final result will still be (97).
Of course we can then no longer maintain that any position coordinates
for the system are as likely as any other: as a result of the forces on
the system certain parts of configuration space will in general be
preferred by the system, depending on the nature of $V(x_j, y_j, z_j)$.

A further observation is in order on the statistical mechanical 
interpretation of the Maxwell-Boltzmann distribution law. In the 
immediately preceding discussion we have tried to develop the analogy 
with kinetic theory by constructing a Gibbs ensemble for a whole 
system of free particles. This seems to be the natural course to pursue, 
but it is interesting to observe that we can achieve the same result by 
forming the ensemble for a single particle. Let us see how this comes 
about. The Hamiltonian for a system consisting of such a particle of 
mass m, having kinetic energy only, is 

$$H = (p_x^2 + p_y^2 + p_z^2)/2m. \tag{99}$$

The phase probability for the corresponding canonical ensemble is 



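$$P\,d\phi = \frac{e^{-(p_x^2 + p_y^2 + p_z^2)/2m\Theta}\,dx\,dy\,dz\,dp_x\,dp_y\,dp_z}{\int\!\!\int\!\!\int\!\!\int\!\!\int\!\!\int e^{-(p_x^2 + p_y^2 + p_z^2)/2m\Theta}\,dx\,dy\,dz\,dp_x\,dp_y\,dp_z}. \tag{100}$$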



It must be emphasized that in constructing the canonical ensemble for a 
single particle we cannot assume that the energy of the particle is 
fixed. By the very nature of the canonical ensemble we must assume
the possibility that the energy of the system being described shall be
able to fluctuate. The particle we are here talking about is one whose 
energy can then have any value. It cannot therefore be considered 
free in the usual dynamical sense, even though we are treating its 
energy as wholly kinetic. 

If we carry out the integration in the denominator of (100), we can 
write 

$$P\,d\phi = \frac{e^{-(p_x^2 + p_y^2 + p_z^2)/2m\Theta}\,dx\,dy\,dz\,dp_x\,dp_y\,dp_z}{\tau\,(2\pi m\Theta)^{3/2}}, \tag{101}$$

where $\tau$ is the physical volume in which the particle is confined. If
further we require the probability that the phase point have momentum
values $p_x, p_y, p_z$ in the indicated interval with no restriction on
the coordinates $x, y, z$, we obtain

$$P'\,d\phi' = (2\pi m\Theta)^{-3/2}\,e^{-(p_x^2 + p_y^2 + p_z^2)/2m\Theta}\,dp_x\,dp_y\,dp_z. \tag{102}$$

This is identical with eq. (98) which gives the average fractional num- 
ber of particles in a whole aggregate of free particles having their 
momentum components in the indicated interval. It therefore appears 
that we need not have considered the whole aggregate in forming the 
canonical ensemble. A single member suffices. Closer inspection of 
the situation indicates that the reason for this is that the particles in 
the aggregate are free and do not affect each other. If the particles 
were to act on each other with forces, the Hamiltonian would no longer 
be of the simple form (93) and the possibility of replacing the whole 
Hamiltonian by that for a single particle in forming the ensemble would 
no longer exist. 

11. DEVIATIONS OF QUANTITIES FROM THEIR AVERAGE VALUES 

The utility of statistical mechanics for thermodynamics resides in 
the average quantities computed over an ensemble; these are to be 
identified with the physically measured values of these quantities. 
For example, the actually measured energy of a physical system is 
taken to be the canonical average of the energy over the canonical 
ensemble representing the system. On the other hand in the ensemble 
the various elements correspond to different energies and there are 
then actually wide deviations from the average. This situation appears 
rather disconcerting when we are trying to represent a system with con- 
stant energy by a canonical ensemble. Offhand we should be inclined 
to say that it could not be done and that we must represent such a 
system by a microcanonical ensemble. That this is not really always
the case, however, is seen from an examination of the magnitude of the
deviations from the average energy in a canonical ensemble. 

We need not confine our investigation to the energy. Let $x$ be any
function of the coordinates and momenta of the system. The fluctuation
or deviation of $x$ from its canonical average is represented by

$$\epsilon = x - \bar{x}. \tag{103}$$

(The bar here indicates canonical average without further
specification.) The average deviation is then at once

$$\bar{\epsilon} = \overline{(x - \bar{x})} = 0. \tag{104}$$

To get something more significant we introduce the fractional
root-mean-square deviation or

$$\sigma = \sqrt{\frac{\overline{\epsilon^2}}{\bar{x}^2}} = \sqrt{\frac{\overline{x^2} - \bar{x}^2}{\bar{x}^2}}. \tag{105}$$

We shall take $\sigma$ as a measure of the average deviation of $x$ from
its average value. For the sake of being specific let $x = H$ and let us
confine our comments to an ensemble representing an aggregate of free
particles, i.e., an ideal gas, where

$$\bar{H} = \bar{E} = \frac{f}{2}\,\Theta. \tag{106}$$

As usual $f$ is the number of degrees of freedom of the system. We
replace $1/\Theta$ by $z$ and let $\int e^{-Hz}\,d\phi = Q$. Then

$$\frac{dQ}{dz} = -\int H e^{-Hz}\,d\phi = -\bar{E}\cdot Q$$

from the definition of canonical average. Moreover

$$\frac{d^2Q}{dz^2} = \int H^2 e^{-Hz}\,d\phi = \overline{E^2}\cdot Q.$$

We therefore have

$$\overline{E^2} - \bar{E}^2 = \frac{1}{Q}\frac{d^2Q}{dz^2} - \frac{1}{Q^2}\left(\frac{dQ}{dz}\right)^2 = -\frac{d\bar{E}}{dz} = \Theta^2\,\frac{d\bar{E}}{d\Theta}. \tag{107}$$

Hence

$$\sigma^2 = \frac{\overline{E^2} - \bar{E}^2}{\bar{E}^2} = \frac{\Theta^2}{\bar{E}^2}\,\frac{d\bar{E}}{d\Theta} = \frac{2}{f}. \tag{108}$$

Consequently if the number of degrees of freedom of the system is
very large, as it will be in the case of a gas, $\sigma$ becomes
negligibly small and we are after all justified in using a canonical
ensemble to represent the system.
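
The result (108) lends itself to a simple numerical check. The Python
sketch below is an illustration of ours; the function names, the seed,
and the sample sizes are assumptions made only for the example. It builds
members of a canonical ensemble for an ideal gas by drawing each of the
$f$ momentum components from a Gaussian of variance $m\Theta$, computes
the energy of each member, and compares the sampled fractional
root-mean-square deviation with $\sqrt{2/f}$.

```python
import math
import random


def fractional_rms_exact(f):
    """Fractional r.m.s. energy deviation of eq. (108): sigma = sqrt(2/f)."""
    return math.sqrt(2.0 / f)


def fractional_rms_sampled(f, theta=1.0, m=1.0, n_samples=20_000, seed=1):
    """Estimate sigma by sampling energies H = sum_j p_j^2 / (2m).

    Each momentum component is drawn from a Gaussian with variance
    m*theta, which is the canonical distribution for free particles.
    """
    rng = random.Random(seed)
    sigma_p = math.sqrt(m * theta)
    energies = []
    for _ in range(n_samples):
        energy = sum(rng.gauss(0.0, sigma_p) ** 2 for _ in range(f)) / (2.0 * m)
        energies.append(energy)
    mean = sum(energies) / n_samples
    mean_sq = sum(e * e for e in energies) / n_samples
    return math.sqrt(mean_sq - mean * mean) / mean


if __name__ == "__main__":
    for f in (3, 30, 300):
        print(f, fractional_rms_sampled(f, n_samples=5_000), fractional_rms_exact(f))
```

For a gas, with $f$ of the order of $10^{20}$, the exact expression gives
a fractional deviation of the order of $10^{-10}$.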

There is another way of looking at the matter which brings out the
same point. We may compute the fractional number of elements in a
canonical ensemble having energies lying in the range from $E$ to
$E + \Delta E$, where $E$ is any energy value. This fractional number
will clearly be

$$\frac{\Delta N}{N} = \int_{E}^{E+\Delta E} e^{(\psi - H)/\Theta}\,d\phi. \tag{109}$$



In this expression $e^{(\psi - H)/\Theta}$ is to be integrated over that
portion of phase space for which the Hamiltonian $H$ lies between $E$ and
$E + \Delta E$. Now if we restrict the discussion to an ideal gas of $n$
particles of mass $m$, $H$ has the form ($f$ = number of degrees of
freedom = $3n$)

$$H = \sum_{j=1}^{f} \frac{p_j^2}{2m}. \tag{110}$$

In evaluating (109) it will be convenient to write
$d\phi = d\phi_q\,d\phi_p$ where $d\phi_q$ is the configuration coordinate
part of the phase space volume element and $d\phi_p$ is the momentum part.
Then

$$\frac{\Delta N}{N} = \tau^{f/3}\int_{E}^{E+\Delta E} e^{(\psi - H)/\Theta}\,d\phi_p, \tag{111}$$

where $\tau$ = the physical volume occupied by the gas and
$\int d\phi_q = \tau^{f/3}$.

Now we shall let the volume in momentum space occupied by the points
representing systems whose energies lie between 0 and $E$ be $\Phi_p$,
where then

$$\Phi_p = \int_{0}^{E} d\phi_p. \tag{112}$$

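It is clear that $\Phi_p$ is some function of $E$, and we must now find
out what this function is. Let us imagine first that $E = 1$ (unit of
energy). The upper limit in (112) is then really equivalent to the
relation

$$\sum_{j=1}^{f} \frac{p_j^2}{2m} = 1. \tag{113}$$

On the other hand if $E = a^2$, the upper limit in (112) is equivalent to

$$\sum_{j=1}^{f} \frac{p_j^2}{2m} = a^2. \tag{114}$$

It is clear we can write (114) in the form

$$\sum_{j=1}^{f} \frac{(p_j/a)^2}{2m} = 1. \tag{114'}$$

Consequently the integral

$$\int\!\cdots\!\int d\!\left(\frac{p_1}{a}\right) d\!\left(\frac{p_2}{a}\right) \cdots d\!\left(\frac{p_f}{a}\right),$$

in which the lower limits are zero in each case and the upper limits
subject only to (114'), is the same as the integral

$$(\Phi_p)_{E=1} = \int\!\cdots\!\int dp_1\,dp_2 \cdots dp_f,$$

in which the lower limits are zero and the upper ones subject only to
(113). But the integral for which the energy $E$ has the value $a^2$ is
given by

$$(\Phi_p)_{E=a^2} = \int\!\cdots\!\int dp_1\,dp_2 \cdots dp_f = a^f \int\!\cdots\!\int d\!\left(\frac{p_1}{a}\right) d\!\left(\frac{p_2}{a}\right) \cdots d\!\left(\frac{p_f}{a}\right),$$

where the upper limits are subject to (114'). Therefore we conclude
that

$$(\Phi_p)_E = E^{f/2}\,(\Phi_p)_{E=1}. \tag{115}$$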

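$\Phi_p$ must then vary directly as $E^{f/2}$, since $(\Phi_p)_{E=1}$ is a
constant. This constant may be evaluated as follows. Let

$$\Phi_p = C E^{f/2}. \tag{116}$$

Going back to (111) we see that we can replace $d\phi_p$ by
$d\Phi_p = (f/2)\,C E^{f/2-1}\,dE$. Moreover if we integrate (111) between
$E = 0$ and $E = \infty$ we get unity. That is

$$\tau^{f/3}\,e^{\psi/\Theta}\,\frac{f}{2}\,C \int_{0}^{\infty} e^{-E/\Theta} E^{f/2-1}\,dE = 1. \tag{117}$$

We recall that

$$e^{-\psi/\Theta} = \int e^{-H/\Theta}\,d\phi = \tau^{f/3}\,(2\pi m\Theta)^{f/2}. \tag{118}$$

Therefore (117) becomes

$$\frac{f}{2}\,C \int_{0}^{\infty} e^{-E/\Theta} E^{f/2-1}\,dE = (2\pi m\Theta)^{f/2}. \tag{119}$$

From the definition of the gamma function there results

$$C = \frac{(2\pi m\Theta)^{f/2}}{\Theta^{f/2}\,\Gamma(f/2 + 1)} = \frac{(2\pi m)^{f/2}}{\Gamma(f/2 + 1)}. \tag{120}$$

We may now express $\Delta N/N$ as follows

$$\frac{\Delta N}{N} = \frac{1}{(2\pi m\Theta)^{f/2}} \cdot \frac{f}{2} \cdot \frac{(2\pi m)^{f/2}}{\Gamma(f/2 + 1)} \cdot \int_{E}^{E+\Delta E} e^{-E/\Theta} E^{f/2-1}\,dE,$$

which reduces for $\Delta E$ sufficiently small compared with $E$ to the
form

$$\frac{\Delta N}{N} = \frac{e^{-E/\Theta}\,E^{f/2-1}}{\Theta^{f/2}\,\Gamma(f/2)}\,\Delta E. \tag{121}$$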

Incidentally the reader may check this final expression by observing
that the summation of (121) over all $E$ gives unity in conformity with
the fact that

$$\sum \frac{\Delta N}{N} = 1.$$



Let us now find the "most popular" value of $E$, i.e., that for which
$\Delta N/\Delta E$ is a maximum. For this value of $E$ there are in a
given energy interval more elements of the ensemble than in any other
interval of the same size. We differentiate $\Delta N/\Delta E$ with
respect to $E$ and on putting the result equal to zero obtain

$$E_{mp} = \left(\frac{f}{2} - 1\right)\Theta. \tag{122}$$

If $f$ is very large this is effectively $(f/2)\,\Theta$ or the canonical
average (106). For a system with a very large number of degrees of
freedom the canonical average energy is also the "most popular" value of
the energy. This again checks our previous conclusion that for $f$ large
we do not make a very great mistake in using the canonical average
$\bar{E}$ for the energy of the system represented by a canonical
ensemble.

There remains only the task of finding the number of systems having
energy slightly different from $E_{mp}$. Thus, for example, suppose
$E = 1.01\,E_{mp} = 1.01\,(f/2 - 1)\,\Theta$. It is left to the reader to
carry out the straightforward arithmetical calculation showing that for
1 cm$^3$ of gas under standard conditions with
$f = 3(2.7) \times 10^{19}$

$$\frac{(\Delta N)_{E}}{(\Delta N)_{E_{mp}}} = e^{-1.8 \times 10^{15}}.$$

This shows that the chance of picking out in the canonical ensemble an
element corresponding to energy differing by only 1 per cent from the
most popular value is practically negligible. The corresponding
fluctuation of the energy of the actual system represented by the
ensemble from $\bar{E}$ is therefore also negligibly small. All this of
course is predicated on the large value of $f$.
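
The arithmetic left to the reader can be sketched as follows. From (121)
and (122) the logarithm of the ratio of $\Delta N/\Delta E$ at
$E = xE_{mp}$ to its value at $E_{mp}$ is $(f/2 - 1)(\log x - x + 1)$,
which does not involve $\Theta$. The Python functions below are our own
illustrative names, not part of the text; they evaluate this logarithm
directly, since the ratio itself is far too small to be represented as a
floating-point number.

```python
import math


def log_density(energy, f, theta):
    """Natural log of (Delta N / N) / Delta E as given by eq. (121)."""
    half_f = 0.5 * f
    return (
        -energy / theta
        + (half_f - 1.0) * math.log(energy)
        - half_f * math.log(theta)
        - math.lgamma(half_f)
    )


def most_popular_energy(f, theta):
    """Energy at which eq. (121) is a maximum, eq. (122)."""
    return (0.5 * f - 1.0) * theta


def log_ratio_to_peak(x, f):
    """Log of the density at E = x * E_mp divided by the density at E_mp.

    Obtained from eqs. (121) and (122); the result is independent of theta.
    """
    return (0.5 * f - 1.0) * (math.log(x) - (x - 1.0))


if __name__ == "__main__":
    f_gas = 3 * 2.7e19  # 1 cm^3 of gas under standard conditions
    print("log of ratio at 1.01 E_mp:", log_ratio_to_peak(1.01, f_gas))
```

With $f = 3(2.7) \times 10^{19}$ the exponent so obtained is negative and
of the order of $10^{15}$ in magnitude, so that the ratio is
indistinguishable from zero for every practical purpose.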

12. THE STATISTICAL MECHANICAL INTERPRETATION OF 
TEMPERATURE 

We have now seen the possibility of representing thermodynamical 
quantities by means of averages over Gibbsian ensembles and have 
learned that the choice of ensemble, i.e., whether canonical or micro- 
canonical, is largely a matter of mathematical convenience. In order to 
make the connection between statistical mechanics and thermodynam- 
ics more definite and convincing it is necessary to establish more 
precisely the statistical mechanical analogues of the fundamental 
thermodynamical state variables, temperature and entropy. As a 
matter of fact in Sec. 10 in the discussion of the Maxwell-Boltzmann 
distribution law we have had an intimation that the analogue of 
temperature is to be found in the characteristic ensemble parameter
$\Theta$. We now wish to place this association on a somewhat firmer
basis by showing that $\Theta$ possesses certain important properties
which are precisely those of the temperature in the thermodynamical
sense. We
recall that when two thermodynamical systems are in equilibrium the 
condition that they have the same temperature is that when they are 
put in thermal contact (and of course isolated from all other systems) 
the composite system thus formed will be in equilibrium at the same 
temperature as that of either system before contact. On the other 
hand two systems in equilibrium at different temperatures will not be 
in equilibrium when joined, though they will, of course, approach 
equilibrium with a change in the temperature of both. 

Let the Hamiltonians of the two systems be $H_A$ and $H_B$
respectively. The phase probabilities in the canonical ensembles
representing the two systems will then be respectively

$$P_A\,d\phi_A = e^{(\psi_A - H_A)/\Theta}\,d\phi_A, \tag{123}$$

$$P_B\,d\phi_B = e^{(\psi_B - H_B)/\Theta}\,d\phi_B, \tag{124}$$

where we have assumed the same parameter $\Theta$ for both ensembles.
Now let the systems be joined. The Hamiltonian for the composite
system will have the form

$$H = H_A + H_B + H_{AB}, \tag{125}$$

where the term $H_{AB}$ refers to the energy of interaction of the two
systems. We shall suppose that this energy is very small compared with
either $H_A$ or $H_B$, though there are indeed practical cases in which
this is by no means true; these we shall rule out of our present
discussion and proceed to call $H_{AB}$ zero, effectively.

Now $P_A\,d\phi_A$ denotes the probability of picking at random from
the ensemble corresponding to system $A$ an element whose phase
lies in $d\phi_A$, and similarly for $B$. Let the phase space for $A$
have $2k$ dimensions and that for $B$ have $2l$ dimensions. Then the
phase space for the composite system will have $2k + 2l$ dimensions and
the element of volume in it will be denoted by $d\phi_A\,d\phi_B$. The
probability of picking at random from the ensemble of the composite
system an element whose $A$ component has its phase in $d\phi_A$ and
whose $B$ component has at the same time its phase in $d\phi_B$ is, from
elementary probability considerations,

$$P_A P_B\,d\phi_A\,d\phi_B = e^{(\psi_A + \psi_B - H_A - H_B)/\Theta}\,d\phi_A\,d\phi_B. \tag{126}$$

In other words we think of the combination of the two systems as taking
place in such a way that each element of the composite ensemble is
made up by joining two elements of the original ensembles for $A$ and
$B$ respectively. But the probability of picking out of the composite
ensemble an element with phase in $d\phi = d\phi_A\,d\phi_B$ of course is
just

$$P\,d\phi = e^{(\psi - H)/\Theta}\,d\phi. \tag{127}$$

Now $\psi = \psi_A + \psi_B$ since the $\psi$'s are constants and no
matter what precise relation exists between $P_A P_B\,d\phi_A\,d\phi_B$
and $P\,d\phi$, we must have in both cases

$$\int P_A P_B\,d\phi_A\,d\phi_B = 1 = \int P\,d\phi. \tag{128}$$



But from the further fact that $H = H_A + H_B$ in the ideal case of
negligible interaction it follows that actually

$$P_A P_B\,d\phi_A\,d\phi_B = P\,d\phi, \tag{129}$$

which means that when the two original systems having common $\Theta$
are joined the resulting system is described by an ensemble
characterized by the same $\Theta$. On the other hand if the ensemble
for $A$ had been characterized by the parameter $\Theta_A$ and that for
$B$ by a different parameter $\Theta_B$, after joining we should have

$$P_A P_B\,d\phi_A\,d\phi_B = e^{(\psi_A - H_A)/\Theta_A}\,e^{(\psi_B - H_B)/\Theta_B}\,d\phi_A\,d\phi_B. \tag{130}$$

This would not correspond with $P\,d\phi$ for any $\Theta$ and hence no
canonical ensemble could be formed for the resulting system, i.e., it
could not be a system in equilibrium in the theory of statistical
mechanics. However we must suppose that after a sufficiently long time
has elapsed the composite system will come to equilibrium by some kind
of energy exchange between the two original systems. It can then be
shown that the modulus of the resulting ensemble has a value lying
between $\Theta_A$ and $\Theta_B$. Indeed by the use of the equipartition
principle it develops that $\Theta$ is related to $\Theta_A$ and
$\Theta_B$ in precisely the same way in which the final equilibrium
temperature of a mixture of two gases is related to the original
temperatures of the components.
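
The last statement can be made concrete in the simplest case. As an
assumption not made explicit in the text, let both systems be ideal
gases, with $k$ and $l$ degrees of freedom respectively (so that their
phase spaces have $2k$ and $2l$ dimensions as above; $k$ here is not the
Boltzmann constant), and let the interaction energy be negligible.
Equipartition in the form (106) gives average energies $k\Theta_A/2$ and
$l\Theta_B/2$ before contact and $(k + l)\Theta/2$ after equilibrium is
reached, so that conservation of energy requires

```latex
\frac{k}{2}\,\Theta_A + \frac{l}{2}\,\Theta_B = \frac{k + l}{2}\,\Theta
\qquad\Longrightarrow\qquad
\Theta = \frac{k\,\Theta_A + l\,\Theta_B}{k + l}.
```

Being a weighted mean with positive weights, this $\Theta$ necessarily
lies between $\Theta_A$ and $\Theta_B$.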

It is clear that $\Theta$ possesses the chief properties that
characterize the thermodynamical temperature $T$. The precise relation
between $\Theta$ and $T$, i.e., the nature of the constant $c$ in the
relation

$$\Theta = cT,$$

can be obtained only from a comparison between an empirical
thermodynamical law and the statistical mechanical analogue. If we use
the equation of state (eq. 14 of Chapter III) for this purpose it is not
difficult to show that $c = k$, i.e., that

$$\Theta = kT, \tag{131}$$

which indicates that the ensemble modulus in the Gibbs statistical
mechanics plays the same role as the canonical distribution modulus
in the classical Maxwell-Boltzmann statistics (cf. Section 3 of Chapter
IV).

13. THE STATISTICAL MECHANICAL INTERPRETATION OF ENTROPY 

Closely associated with and fully as fundamental as the interpreta- 
tion of temperature in statistical mechanics is the meaning of entropy. 
The problem is to find a statistical quantity which possesses the 
properties associated with entropy in thermodynamics (Chapter III, 
Sec. 2). Following Gibbs we again begin with a canonical ensemble in
which

$$e^{-\psi/\Theta} = \int e^{-H/\Theta}\,d\phi. \tag{83}$$

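Now the Hamiltonian of the system is a function of the $p$'s and $q$'s
which also involves certain parameters. These we shall label
$\xi_1, \xi_2, \ldots, \xi_n$. Their number depends on the type of system
and the number of degrees of freedom. In the simple harmonic oscillator,
for example, there are two, viz., the mass and stiffness of the system.
(Cf. Sec. 3 of Chapter IV for the introduction of these external
parameters in the Maxwell-Boltzmann statistics.) Let us now see what
happens when the $\xi$'s and $\Theta$ are varied. This variation has no
reference to change in time and does not entail changes in the $p$'s and
$q$'s. Equation (83) then yields

$$e^{-\psi/\Theta}\left(-\frac{\delta\psi}{\Theta} + \frac{\psi}{\Theta^2}\,\delta\Theta\right) = \frac{\delta\Theta}{\Theta^2}\int H e^{-H/\Theta}\,d\phi - \frac{1}{\Theta}\sum_j \delta\xi_j \int \frac{\partial H}{\partial \xi_j}\,e^{-H/\Theta}\,d\phi. \tag{132}$$

Multiplication by $e^{\psi/\Theta}$ and the replacement of
$\int H e^{(\psi - H)/\Theta}\,d\phi$ by $\bar{H}$ yields

$$-\frac{\delta\psi}{\Theta} + \frac{\psi}{\Theta^2}\,\delta\Theta = \frac{\bar{H}}{\Theta^2}\,\delta\Theta - \frac{1}{\Theta}\sum_j \overline{\frac{\partial H}{\partial \xi_j}}\,\delta\xi_j. \tag{133}$$

If we denote $(\psi - H)/\Theta$ by the new symbol $\eta$, so that
$(\psi - \bar{H})/\Theta = \bar{\eta}$, we have from (133)

$$\delta\psi = \bar{\eta}\,\delta\Theta + \sum_j \overline{\frac{\partial H}{\partial \xi_j}}\,\delta\xi_j. \tag{134}$$

Now if in the defining equation for $\eta$ we vary both coordinates and
parameters, denoting such changes by differentials to distinguish
them from changes in which parameter alterations alone are involved,
the result is

$$d\psi - d\bar{H} = \bar{\eta}\,d\Theta + \Theta\,d\bar{\eta}. \tag{135}$$

But since $\psi$ and $\Theta$ are independent of the coordinates

$$d\psi = \delta\psi \quad\text{and}\quad d\Theta = \delta\Theta.$$

Comparison of (134) and (135) then leads to

$$-\Theta\,d\bar{\eta} = d\bar{E} - \sum_j \overline{\frac{\partial H}{\partial \xi_j}}\,\delta\xi_j, \tag{136}$$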



where we have replaced $d\bar{H}$ by its equivalent $d\bar{E}$, i.e., the
change in the average energy. Now the sum in the last term of (136)
represents the average change in the energy of the system due to the
variation in the external parameters. Consequently it can be interpreted
as the negative of the average work which the system does when the
external parameters are varied. We shall denote this work by $dW$.
Therefore (136) becomes

$$-\Theta\,d\bar{\eta} = d\bar{E} + dW. \tag{137}$$

If we introduce the relation (131) between $\Theta$ and $T$, this takes
the form

$$-kT\,d\bar{\eta} = d\bar{E} + dW, \tag{138}$$

which looks very much like the familiar thermodynamical expression
(eq. 4 in Chapter III)

$$T\,dS = dE + dW, \tag{139}$$

(where indeed we previously used $\Delta W$ in place of $dW$ with no
difference in physical meaning). If we decide to let (138) serve as the
statistical interpretation of (139) we must identify $dS$ with
$-k\,d\bar{\eta}$ or set

$$S = -k\bar{\eta}, \tag{140}$$

with a possible additive constant independent of the coordinates and
external parameters as well as $\Theta$ and $\psi$. This is Gibbs'
statistical mechanical interpretation of the entropy of a system.

We ought to look into the analogy between $-\bar{\eta}$ and the entropy
a little. For example we recall from Chapter III the statement of the
second law of thermodynamics in the form that the entropy is a maximum
for a closed system in equilibrium. Can we show that $-\bar{\eta}$ is
larger for a canonical ensemble than for any other ensemble with the
same number of elements and the same average energy? We shall
call $\eta = (\psi - H)/\Theta = \log P$ the index of phase probability.
Now let $\eta$ for the non-canonical ensemble be denoted by $\eta'$,
where

$$\eta' = \eta + \Delta\eta, \tag{141}$$

and $\Delta\eta$ is an arbitrary function of the $p$'s and $q$'s. Because
the number of elements and average energy are the same in both
ensembles, we can write

$$\int e^{\eta + \Delta\eta}\,d\phi = \int e^{\eta}\,d\phi = 1, \tag{142}$$

and

$$\int H e^{\eta + \Delta\eta}\,d\phi = \int H e^{\eta}\,d\phi. \tag{143}$$

We seek to prove that $\bar{\eta}' > \bar{\eta}$. This corresponds to
$-\bar{\eta} > -\bar{\eta}'$. Though the average $\bar{\eta}'$ is not a
canonical average it is calculated over its ensemble in the usual way.
Thus we wish to show that

$$\int (\eta + \Delta\eta)\,e^{\eta + \Delta\eta}\,d\phi > \int \eta\,e^{\eta}\,d\phi. \tag{144}$$




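Now

$$\int (\eta + \Delta\eta)\,e^{\eta + \Delta\eta}\,d\phi = \frac{\psi}{\Theta}\int e^{\eta + \Delta\eta}\,d\phi - \frac{1}{\Theta}\int H e^{\eta + \Delta\eta}\,d\phi + \int \Delta\eta\,e^{\eta + \Delta\eta}\,d\phi. \tag{145}$$

By using (142) and (143), we can reduce (145) to

$$\int (\eta + \Delta\eta)\,e^{\eta + \Delta\eta}\,d\phi = \frac{\psi}{\Theta} - \frac{1}{\Theta}\int H e^{\eta}\,d\phi + \int \Delta\eta\,e^{\eta + \Delta\eta}\,d\phi. \tag{146}$$

But for the same reason

$$\int \eta\,e^{\eta}\,d\phi = \frac{\psi}{\Theta} - \frac{1}{\Theta}\int H e^{\eta}\,d\phi.$$

Therefore to prove (144) we must examine
$\int \Delta\eta\,e^{\eta + \Delta\eta}\,d\phi$, which by (142) can also be
written $\int (\Delta\eta\,e^{\Delta\eta} + 1 - e^{\Delta\eta})\,e^{\eta}\,d\phi$.
The parenthetical term in the integrand is
$(\Delta\eta - 1)\,e^{\Delta\eta} + 1$. If we plot this as a function of
$\Delta\eta$, we see that it has only one minimum for real values of
$\Delta\eta$. This has the value zero and occurs for $\Delta\eta = 0$.
For all other values of $\Delta\eta$, the term in question is positive.
Moreover $e^{\eta}$ is always positive. Therefore

$$\left[(\Delta\eta - 1)\,e^{\Delta\eta} + 1\right] e^{\eta} \geq 0 \tag{147}$$

(the equality sign corresponding to $\Delta\eta = 0$) for the whole range
of integration. Therefore either $\Delta\eta$ vanishes, in which case the
two ensembles are identical throughout, or

$$\int \Delta\eta\,e^{\eta + \Delta\eta}\,d\phi > 0$$

and (144) is substantiated. Hence

$$\bar{\eta}' > \bar{\eta}. \tag{148}$$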



Consequently for a system not in equilibrium and represented by a
non-canonical ensemble the entropy represented by $-k\bar{\eta}'$ will
tend to increase. It is clear that the statistical mechanical analogue
$-k\bar{\eta}$ for entropy satisfies the "increasing" property of entropy
in thermodynamics.

Further evidence of the suitability of $-k\bar{\eta}$ to represent the
entropy of a thermodynamical system will be found in the fact that
$-\bar{\eta}$ is a maximum for a uniform distribution of the elements of
the ensemble in phase space as compared with any other distribution of
the same number of elements and corresponding to the same limits of the
phase. The reader can readily show this by a slight modification of
the method just used to demonstrate $\bar{\eta}' > \bar{\eta}$. In other
words $-\bar{\eta}$ can be used as a measure of uniformity; the greater
the value of $-\bar{\eta}$ for a given ensemble, the more nearly uniform
is the distribution of elements. The canonical ensemble is the one which
is most nearly uniform in distribution of all ensembles with given
average energy and given number of elements. Obviously the
appropriateness with which a canonical ensemble can be considered to
represent a system in equilibrium depends in the last analysis on the
extent to which any ensemble representing an actual dynamical system
tends toward uniformity of distribution with the passage of time,
independently of the initial distribution. This approach to uniformity
cannot be logically demonstrated for an arbitrary initial distribution
but proofs have been given for initial distributions which are not
themselves so specialized as to be highly improbable. 11 We therefore
feel safe in accepting the proposition that the Gibbsian analogue of the
entropy for a closed system tends to increase and practically never
decreases. We must recognize, of course, that it is subject to the same
probability difficulties already envisaged in Sec. 3, Chapter IV in the
discussion of the Maxwell-Boltzmann interpretation of entropy.
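
The inequality (148) can be illustrated on a small discrete model. The
Python sketch below is ours and replaces the continuous phase space by
four equally spaced energy levels, an assumption made purely for
illustration; the function names are likewise hypothetical. It compares
the average index of probability, here $\sum p \log p$, for the canonical
distribution and for a perturbed distribution having the same
normalization and the same average energy, and it evaluates the bracket
$(\Delta\eta - 1)e^{\Delta\eta} + 1$ on which the proof rests.

```python
import math


def canonical(levels, theta):
    """Canonical probabilities p_i proportional to exp(-E_i / theta)."""
    weights = [math.exp(-e / theta) for e in levels]
    total = sum(weights)
    return [w / total for w in weights]


def mean_index(probs):
    """Average index of probability: sum of p log p (discrete eta-bar)."""
    return sum(p * math.log(p) for p in probs if p > 0.0)


def mean_energy(levels, probs):
    return sum(e * p for e, p in zip(levels, probs))


def perturbed(probs, eps):
    """Shift probability along (1, -2, 1, 0).

    For the levels 0, 1, 2, 3 this direction changes neither the
    normalization nor the average energy.
    """
    direction = (1.0, -2.0, 1.0, 0.0)
    return [p + eps * d for p, d in zip(probs, direction)]


def bracket(x):
    """The term (x - 1) e^x + 1: zero at x = 0 and positive elsewhere."""
    return (x - 1.0) * math.exp(x) + 1.0


if __name__ == "__main__":
    levels = [0.0, 1.0, 2.0, 3.0]
    p = canonical(levels, theta=1.0)
    q = perturbed(p, 0.02)
    print("canonical eta-bar:", mean_index(p))
    print("perturbed eta-bar:", mean_index(q))
```

In this model the perturbed distribution always yields the larger
$\bar{\eta}$, and hence the smaller value of the entropy analogue
$-k\bar{\eta}$, in agreement with (148).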

One last point remains for consideration: does $-k\bar{\eta}$ represent
a state variable? From the definition

$$\bar{\eta} = \int \eta\,e^{\eta}\,d\phi = \int \frac{\psi - H}{\Theta}\,e^{(\psi - H)/\Theta}\,d\phi$$

it follows that after the integration has been carried out the only
quantities left on the right-hand side are the parameters, viz., the
$\xi_j$, $\Theta$ and $\psi$ and the physical volume occupied by the
system being represented. The latter is involved in the integration
limits for the $q$'s. The limits for the $p$'s are $+\infty$ and
$-\infty$ effectively. $\Theta = kT$. Hence effectively $-k\bar{\eta}$
depends only on the volume and temperature and the other parameters
which characterize the state. Therefore $-k\bar{\eta}$ represents a
state variable and its association with entropy may be considered
substantially verified.

11 Cf. Gibbs, "Statistical Mechanics," Chapter XII. For a concrete
illustration, Lindsay and Margenau, op. cit., p. 245, may be consulted.

14. FREE ENERGY AND THE GIBBS PHASE INTEGRAL

We have so far said little about the parameter $\psi$ which enters into
the definition of the probability coefficient for a canonical ensemble.
But eq. (134) written in the form

$$d\psi = \bar{\eta}\,d\Theta - dW,$$

with $-\bar{\eta} = S/k$ and $d\Theta = k\,dT$, is in precisely the form
of eq. (10) of Chapter III, in which $d\psi$ (there represented as
$d\Psi$) represents the change in the free energy of the system.
Consequently we may safely treat $\psi$ as the statistical mechanical
analogue of the free energy. We have already commented on the importance
of this thermodynamic function in the derivation of the equation of
state (eq. 12 of Chapter III). It is interesting to observe that in the
Gibbs statistical mechanics $\psi$ is immediately given in terms of a
certain integral. Thus from (83)

$$\psi = -\Theta \log \int e^{-H/\Theta}\,d\phi. \tag{149}$$



It is customary to refer to $\int e^{-H/\Theta}\,d\phi$ as the Gibbs phase
integral, sometimes denoted by $I$. Its evaluation as a function of
$\Theta$ and the external parameters of a system therefore leads at once
to the equation of state. If we compare (149) with eq. (72) of Chapter
IV we see that the Gibbs phase integral appears to bear some analogy to
the function $Z$, which there we called the distribution or partition
function. Indeed the connection looks even closer if we examine again
the evaluation of $Z$ for a system of free particles in Sec. 4 of
Chapter IV. There except for a multiplicative constant we actually
computed the phase integral.
(NOTE: It must not be forgotten that $\psi$ in statistical mechanics
corresponds to $\Psi$ in Chapter IV.)

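As an illustration of the phase integral we shall calculate it for the
special case of a system of $n$ simple harmonic oscillators with masses
$m_j$ and stiffness coefficients $k_j$. The Hamiltonian is

$$H = \sum_{j=1}^{n} \left(\frac{p_j^2}{2m_j} + \frac{k_j q_j^2}{2}\right) \tag{150}$$

if we suppose that the oscillators are free of mutual interaction. We
therefore have from (149)

$$\psi = -\Theta \log \frac{1}{h^n}\int\!\cdots\!\int e^{-\sum_j \left(p_j^2/2m_j + k_j q_j^2/2\right)/\Theta}\,dq_1 \cdots dq_n\,dp_1 \cdots dp_n, \tag{151}$$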




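with the limits of integration taken between $-\infty$ and $+\infty$ for
each variable. In this expression we have written
$d\phi = dq_1 \cdots dq_n\,dp_1 \cdots dp_n / h^n$,
where $h$ is a quantity having the dimensions of coordinate times
momentum and is here put in to secure the proper dimensionality (cf.
the discussion after eq. 83 in Sec. 9). Now

$$\int_{-\infty}^{\infty} e^{-p_j^2/2m_j\Theta}\,dp_j = (2\pi m_j\Theta)^{1/2}, \qquad \int_{-\infty}^{\infty} e^{-k_j q_j^2/2\Theta}\,dq_j = (2\pi\Theta/k_j)^{1/2}.$$

Consequently, writing $\nu_j = (1/2\pi)\sqrt{k_j/m_j}$ for the frequency
of the $j$th oscillator,

$$\psi = -\Theta \sum_{j=1}^{n} \log \frac{\Theta}{h\nu_j}. \tag{152}$$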



If the dependence of the frequencies $\nu_j$ on the volume of the system
were known we could use (152) to determine the equation of state.
However, we can at any rate get the expression for the entropy. From
the equipartition of energy we know that

$$\bar{E} = n\Theta = nkT. \tag{153}$$

This could of course be computed directly from $H$. Now from
$\psi = \bar{E} - TS$, we get

$$S = \frac{\bar{E} - \psi}{T} = nk + k \log \frac{(kT)^n}{h^n \prod_j \nu_j} = nk + k \sum_{j=1}^{n} \log \frac{kT}{h\nu_j}. \tag{154}$$
So far as the Gibbs statistical mechanics is concerned $h$ is just a
constant having the appropriate dimensions to make the fundamental
expression (151) dimensionally correct. It is clear, however, that
we can interpret $h^n$ as the unit of "volume" in phase space. As such
in classical theory it may have any numerical value. According to the
quantum theory, however, $h$ is a fundamental constant of nature,
called the Planck constant of action, with the value (cf. Chapter
VIII)

$$h = 6.55 \times 10^{-27} \text{ erg sec}.$$

The dimensions of $h\nu_j$ are then those of energy like $kT$.
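
The evaluation leading to (152) can be checked numerically for a single
oscillator. In the Python sketch below, which is ours and uses
hypothetical function names and arbitrary parameter values, the phase
integral $(1/h)\iint e^{-H/\Theta}\,dq\,dp$ is estimated by the midpoint
rule and compared with the closed form $\Theta/h\nu$. The free energy and
the entropy (in units of $k$) of a set of independent oscillators are
then formed from the closed expression.

```python
import math


def gaussian_integral(variance, half_width=10.0, steps=400):
    """Midpoint-rule estimate of the integral of exp(-u^2 / (2*variance))."""
    sigma = math.sqrt(variance)
    du = 2.0 * half_width * sigma / steps
    start = -half_width * sigma
    return sum(
        math.exp(-((start + (i + 0.5) * du) ** 2) / (2.0 * variance))
        for i in range(steps)
    ) * du


def frequency(mass, stiffness):
    """Oscillator frequency nu = (1 / 2 pi) sqrt(k / m)."""
    return math.sqrt(stiffness / mass) / (2.0 * math.pi)


def phase_integral_numeric(mass, stiffness, theta, h):
    """(1/h) * double integral of exp(-H/theta) with H = p^2/2m + k q^2/2.

    H is a sum of a momentum part and a coordinate part, so the double
    integral is the product of two one-dimensional Gaussian integrals.
    """
    momentum_part = gaussian_integral(mass * theta)
    coordinate_part = gaussian_integral(theta / stiffness)
    return momentum_part * coordinate_part / h


def phase_integral_closed_form(mass, stiffness, theta, h):
    return theta / (h * frequency(mass, stiffness))


def free_energy(masses, stiffnesses, theta, h):
    """psi = -theta * sum_j log(theta / (h nu_j)), as in eq. (152)."""
    return -theta * sum(
        math.log(phase_integral_closed_form(m, k, theta, h))
        for m, k in zip(masses, stiffnesses)
    )


def entropy_over_k(masses, stiffnesses, theta, h):
    """S/k = (E - psi) / theta with E = n * theta, eq. (153)."""
    n = len(masses)
    return (n * theta - free_energy(masses, stiffnesses, theta, h)) / theta


if __name__ == "__main__":
    print(phase_integral_numeric(2.0, 5.0, 1.3, 0.7))
    print(phase_integral_closed_form(2.0, 5.0, 1.3, 0.7))
```

A useful consistency check is that the entropy so computed agrees with
$-\partial\psi/\partial\Theta$, as thermodynamics requires of a free
energy.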

In the classical theory of solids, a solid crystal is considered to be
effectively a collection of harmonic oscillators. Consequently the
formulas just derived have an application to an ideal crystalline solid.
In particular from (153) we can get the heat capacity at constant
volume for such a solid. Thus

$$C_v = \frac{\partial \bar{E}}{\partial T} = nk. \tag{155}$$

Here $n$ is the number of degrees of freedom of the crystal. If we think
of each atom making up the crystalline solid as having three degrees of
freedom and assume that the number of atoms per gram molecule is
still Avogadro's number, it follows from (155) that the molar heat
capacity of a monatomic crystal is
$3(6.06) \times 10^{23} \times (1.37) \times 10^{-16}$ ergs/degree C.
When reduced to calories per degree C this figure becomes
5.96 cal/degree C. This is in rather good agreement with the
experimental value for monatomic crystals at room temperature. The simple
classical theory here presented fails indeed to account for the variation
of the molar heat capacity with the temperature. For this the quantum
theory seems to be demanded. (Cf. Chapter IX, Sec. 3.)
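
The arithmetic behind the figure just quoted may be set out explicitly.
The short Python sketch below uses the values of Avogadro's number and of
$k$ given in the text; the conversion factor of $4.18 \times 10^{7}$ ergs
per calorie is our assumption, since the text does not state the value
employed.

```python
AVOGADRO = 6.06e23     # atoms per gram molecule, value used in the text
BOLTZMANN = 1.37e-16   # ergs per degree, value used in the text
ERGS_PER_CALORIE = 4.18e7  # assumed conversion factor, not stated in the text


def molar_heat_capacity_ergs(degrees_of_freedom_per_atom=3):
    """C_v = n k of eq. (155) for one gram molecule, in ergs per degree."""
    return degrees_of_freedom_per_atom * AVOGADRO * BOLTZMANN


def molar_heat_capacity_calories(degrees_of_freedom_per_atom=3):
    """The same heat capacity converted to calories per degree."""
    return molar_heat_capacity_ergs(degrees_of_freedom_per_atom) / ERGS_PER_CALORIE


if __name__ == "__main__":
    print(molar_heat_capacity_ergs(), "ergs/degree C")
    print(molar_heat_capacity_calories(), "cal/degree C")
```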

It should be possible to use eq. (149) for the free energy to attack
theoretically the derivation of the equation of state of a real gas.
Considerable progress has been made on this recently but the subject lies
beyond the scope of the present book. 12

12 An elaborate discussion will be found in Mayer and Mayer, "Statistical
Mechanics," John Wiley & Sons, New York, 1940, pp. 277 ff.

15. ALTERNATIVE INTERPRETATION OF ENTROPY

It is well to recognize that $-k\bar{\eta}$ is not the only statistical
mechanical quantity which possesses the appropriate properties to serve
as an analogue of entropy in thermodynamics. Consider the quantity

$$S = k \log \Phi_{\bar{E}}, \tag{156}$$

where $\Phi_{\bar{E}}$ is the total phase volume enclosed by the energy
surface $H = \bar{E}$, and $\bar{E}$ is the average energy over a
canonical ensemble. Let the energy of the dynamical system represented by
the ensemble, namely $\bar{E}$, be changed slightly without altering the
external parameters $\xi_1, \ldots, \xi_n$. Physically this will
correspond to a flow of heat into or out of the system. Then from (156)
there will be a change in $S$ of magnitude

$$dS = \frac{k}{\Phi_{\bar{E}}}\,\frac{d\Phi_{\bar{E}}}{d\bar{E}}\,d\bar{E}, \tag{157}$$

where $d\Phi_{\bar{E}}/d\bar{E}$ is the rate of change of
$\Phi_{\bar{E}}$ with respect to $\bar{E}$ while the $\xi$'s remain
fixed. From eq. (60) $d\Phi_E/dE = \omega(E)$. Hence

$$dS = k\,\frac{\omega(\bar{E})}{\Phi_{\bar{E}}}\,d\bar{E}. \tag{158}$$

Now we have already shown (eq. 71) that $\Phi_E/\omega(E)$ is twice the
average kinetic energy per degree of freedom taken over the
microcanonical ensemble corresponding to total energy $E$. If the system
consists of free particles, as we shall here assume for convenience,
this means that

$$\frac{\Phi_{\bar{E}}}{\omega(\bar{E})} = \frac{2\bar{E}}{f}, \tag{159}$$

if $f$ is the number of degrees of freedom. But for an aggregate of free
particles, the equipartition theorem yields

$$\bar{E} = \frac{f}{2}\,kT. \tag{160}$$

Therefore (158) becomes

$$dS = \frac{d\bar{E}}{T}. \tag{161}$$

But since $d\bar{E}$ must here represent change in energy caused by the
reversible flow of heat, this is the usual expression for the change
in entropy in a reversible thermodynamical process.

Although the above discussion is not very rigorous and is indeed
rather specialized, the suggestion is, at any rate, that
$k \log \phi_{\bar{E}}$ is a possible analogue of the entropy. We could
proceed to apply to it the tests used on $-\bar{\eta}$. It will be
simpler, however, to examine directly its relation to $\bar{\eta}$. We
shall restrict the discussion to an aggregate of free particles, i.e.,
an ideal gas.

We begin with 

$$e^{-\bar{\eta}} = e^{-\psi/\Theta}\, e^{\bar{E}/\Theta}. \tag{162}$$

But for an ideal gas this becomes

$$e^{-\bar{\eta}} = e^{-\psi/\Theta}\, e^{f/2}.$$

We can dispose of the factor $e^{-\psi/\Theta}$ by recalling that

$$e^{-\psi/\Theta} = \int e^{-H/\Theta}\, d\phi, \tag{163}$$

where the integration is to be conducted over the whole phase space.
Writing

$$d\phi = d\phi_q\, d\phi_p,$$

where $d\phi_q$ refers to that part of the elementary phase volume in
which the q's enter and $d\phi_p$ the corresponding momentum part, we
can further say that



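$$e^{-\psi/\Theta} = \int d\phi_q \int e^{-H/\Theta}\, d\phi_p = \tau^{f/3} \int_{-\infty}^{\infty}\!\cdots\!\int_{-\infty}^{\infty} e^{-(p_1^2 + \cdots + p_f^2)/2m\Theta}\, dp_1 \cdots dp_f, \tag{164}$$

where $\tau$ is the physical volume occupied by the gas. The $f$-tuple
integral in (164) is evaluated in the usual way. The result is

$$e^{-\psi/\Theta} = \tau^{f/3}\,(2\pi m\Theta)^{f/2}. \tag{165}$$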

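The consequence is that

$$-\bar{\eta} = \log \tau^{f/3} + \frac{f}{2}\log 2\pi m\Theta + \frac{f}{2}. \tag{166}$$

We now wish to compare this with $\log \phi_{\bar{E}}$. This may be written

$$\phi_{\bar{E}} = \tau^{f/3} \int_{H \le \bar{E}} d\phi_p. \tag{167}$$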

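The integral has already been evaluated in eqs. (116) and (120), and
we can immediately write

$$\phi_{\bar{E}} = \tau^{f/3}\,\frac{(2\pi m\bar{E})^{f/2}}{\Gamma(f/2 + 1)}. \tag{168}$$

With $\bar{E} = f\Theta/2$ this leads to

$$\log \phi_{\bar{E}} = \log \tau^{f/3} + \frac{f}{2}\log 2\pi m\Theta + \frac{f}{2}\log\frac{f}{2} - \log\Gamma\!\left(\frac{f}{2} + 1\right).$$

Now the asymptotic expansion of $\log\Gamma(f/2 + 1)$ for large positive
$f$ can be put into the form 13

$$\log\Gamma\!\left(\frac{f}{2} + 1\right) = \left(\frac{f}{2} + \frac{1}{2}\right)\log\left(\frac{f}{2} + 1\right) - \left(\frac{f}{2} + 1\right) + \log\sqrt{2\pi} + \text{terms involving } f \text{ in the denominator.}$$

Neglecting terms in this expansion small compared with $f/2$ we see
that $\log \phi_{\bar{E}}$ can be expressed in the form

$$\log \phi_{\bar{E}} = \log \tau^{f/3} + \frac{f}{2}\log 2\pi m\Theta - \log\sqrt{\pi f} + \frac{f}{2}. \tag{169}$$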

13 Whittaker and Watson, "Modern Analysis," fourth edition, Macmillan, 1928.

If now we compare $-\bar{\eta}$ in (166) with the asymptotic form of
$\log \phi_{\bar{E}}$ in (169) we see that they differ only in the term
$\log\sqrt{\pi f}$, which is very small compared with the other terms in
$f$ if $f$ is large. Hence as $f$ increases we have the asymptotic
relation

$$-\bar{\eta} \sim \log \phi_{\bar{E}}. \tag{170}$$

This indicates the important relation between the two definitions of 
entropy. 
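
The size of the discrepancy between the two entropy analogues can be examined numerically. The sketch below is illustrative only: it assumes the ideal-gas expressions (166) and (168) with $\bar{E} = f\Theta/2$, so that the volume and $2\pi m\Theta$ terms cancel in the difference, and the function names are hypothetical.

```python
import math


def entropy_gap(f):
    """(-eta_bar) - log(phi_E) for an ideal gas with f degrees of freedom.

    The terms log tau^(f/3) + (f/2) log(2 pi m Theta) are common to both
    expressions and cancel, leaving f/2 - (f/2) log(f/2) + log Gamma(f/2 + 1).
    """
    half = f / 2.0
    return half - (half * math.log(half) - math.lgamma(half + 1.0))


def stirling_gap(f):
    """Asymptotic value of the gap, log sqrt(pi f)."""
    return 0.5 * math.log(math.pi * f)


if __name__ == "__main__":
    for f in (10, 1000, 10**6):
        gap = entropy_gap(f)
        # The gap grows only logarithmically, so gap / (f/2) tends to zero.
        print(f, gap, stirling_gap(f), gap / (f / 2.0))
```

The last column printed shows the gap relative to the leading term $f/2$, which is the sense in which the two definitions agree for large $f$.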

16. 



This concludes our survey of the statistical mechanics of Gibbs 
which will stand for a long time as a monument of the power of abstract 
thought over physical problems. At the risk of a certain amount of 
redundance it will be desirable to sum up its features as contrasted to 
those of the Maxwell-Boltzmann method of statistical distributions. 

We recall in the first place that the statistical distribution method 
operates throughout with the actual system being discussed. Thus, 
for example, we have a certain set of independent particles with a 
certain property, e.g., energy, and inquire about the most probable 
distribution of the particles with respect to this property. This most 
probable distribution is, of course, arbitrarily defined, but the defini- 
tion in terms of the number of ways of realizing each state of the sys- 
tem, is at any rate plausible. This process leads to the so-called 
canonical distribution. We seek to identify the parameters entering 
into the distribution, i.e., $\psi$, $\Theta$, and log P_c or log w, with observed
properties of the system being described. This is done by showing 
that relations satisfied by these quantities are mathematically of the 
same form as the important thermodynamical relations among the 
state variables of the system in the thermodynamical mode of descrip- 
tion. This provides one statistical interpretation of the macroscopic, 
thermodynamical properties of physical systems. The scheme has been 
criticized from several points of view, notably because of the use of 
Stirling's formula in its mathematical development to evaluate N! no
matter what the value of N is. This has led to an alternative formula- 
tion, namely the method of Darwin and Fowler, which we shall discuss 
in the following chapter. 

The method of Gibbs does not operate with the actual system being 
discussed. Rather it builds an abstract ensemble to represent the 
behavior of the system. The subsequent analysis is carried out wholly 
with the ensemble and the connection with the properties of the actual 
system is made solely through the fundamental postulate of Sec. 6. 
Thus averages taken over the ensemble are conceived to represent 
observed values of the corresponding quantities for the actual system.
This places extreme importance on the choice of ensemble. Gibbs's
choice of the canonical ensemble (he made little use of the micro- 
canonical ensemble) must be considered a stroke of genius. Operating 
with it, he was able to provide an analogy for all the thermodynamics 
which was known at his time. It is interesting to observe that he made 
no effort to build a model mechanism for the physical system under 
consideration. The canonical ensemble is in no sense a model. It is 
an abstract fiction, having no physical existence. Of course, it is true 
that to calculate averages over a canonical ensemble, one needs the 
Hamiltonian function of the system being described; this marks 
Gibbs's scheme as lying wholly within the framework of classical 
mechanics. The Maxwell-Boltzmann method is not subject to this 
particular restriction as it can envisage the distribution of any set of 
entities whatever. In one sense, therefore, the Maxwell-Boltzmann 
method is more general than that of Gibbs. On the other hand the 
Maxwell-Boltzmann statistics is more specialized in the sense that it 
operates only with free particles and thus neglects the possibility of 
their mutual interactions. The attempt to apply the Maxwell- 
Boltzmann statistics to a real gas, for example, necessitates the 
introduction of mechanical concepts and the postulation of forces 
lying outside the framework of the method itself. Indeed it involves 
essentially the application of some kind of kinetic theory. The method 
of Gibbs, on the contrary, is general enough to include all sorts of 
dynamical systems within its scope. 

The comparison between the two types of statistical method will 
hardly be complete, however, without an exposition of the Darwin 
and Fowler modification of the Maxwell-Boltzmann scheme. This 
will form the content of the following chapter. 

PROBLEMS 

1. Apply Euler's theorem on homogeneous functions to deduce eq. (22). 

2. Write the expression for the kinetic energy in terms of spherical, cylindrical 
and paraboloidal coordinates. Evaluate the component conjugate momenta in 
each case and comment on their physical significance. 

3. Prove that the canonical equations are invariant in form with respect to an 
arbitrary point transformation of generalized coordinates. 

4. A particle revolves in a circle about a fixed axis. Plot the representative curve 
in phase space. 

5. A particle moves along a straight line in a uniform field of force. Plot the 
representative curve in phase space under the assumption that the particle is not 
allowed to exceed a certain maximum velocity. 

6. A particle moves along a straight line in a field of force directed toward a
fixed point on the line and varying inversely as the square of the distance from the
point. Plot the representative curve in phase space under the assumption that the
particle is confined to a segment of length a on either side of the fixed point. Assume
that the total energy is negative.

7. In the case of the simple harmonic oscillator form the integral $\oint p\,dq$, which
is supposed to be taken over a whole phase curve. This is called the phase integral.
What allowed energies of the oscillator correspond to equating the phase integral
to nh, where h is Planck's constant of action and n is integral? Do the same problem
for the simple rotator mentioned in Problem 4 above.

8. Find the expression for the average energy over the canonical ensemble cor- 
responding to a simple rotator. Do the same for a microcanonical ensemble. 

9. Discuss the phase space for inverse square central field motion in terms of 
spherical coordinates. N.B. This is, of course, four-dimensional, but since in 
central field motion p B is constant, one can adequately represent the situation by a 
three-dimensional phase space, by employing r, 0, and p r as coordinates. Plot the 
surface of constant energy in this space. Indicate the phase curve on this surface. 

10. Use Gibbs's method to show how Liouville's theorem may be used to prove the
invariance of the element of volume in phase space with respect to an arbitrary point
transformation.

11. A particle moves in an inverse square central force field in an elliptical orbit 
with the force center at one focus. Find the expression for the probability that the 
particle will be found with its radius vector lying in the interval r, r + dr. 

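12. Prove that the canonical average of the energy of the simple harmonic
oscillator is equal to the modulus by evaluating

$$\bar{E} = \frac{\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} H\, e^{-H/\Theta}\, dp\, dq}{\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{-H/\Theta}\, dp\, dq}.$$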

13. Calculate the root-mean-square deviation of the energy from the canonical
mean for an aggregate of N simple harmonic oscillators.

14. Evaluate the momentum space volume $\phi_p$ in eq. (112) directly by the use of
the gamma function $\Gamma(n) = \int_0^{\infty} x^{n-1} e^{-x}\,dx$ and the beta function
$B(m, n) = \int_0^1 x^{m-1}(1 - x)^{n-1}\,dx$. (Cf. E. B. Wilson, "Advanced Calculus,"
p. 378. Ginn and Co., 1912.)

15. Evaluate -* - (Sec. 11) for a system consisting of a single particle with 



three degrees of freedom where E 1.01 mp . Do the same for a system of 100 
particles. 

16. Prove that fj (Sec. 13) is a maximum for a uniform distribution of the
elements of an ensemble in phase space as compared with any other distribution of the
same number of elements and corresponding to the same value of the average energy.



CHAPTER VII 

STATISTICAL MECHANICS BY THE METHOD OF 
DARWIN AND FOWLER 

1. FUNDAMENTAL CONCEPTS 

The basic statistical concept in Gibbs's statistical mechanics is the 
ensemble. This is abandoned in the more recently developed method 
of Darwin and Fowler 1 and we must understand clearly the significance 
of the change. Gibbs considers a system of f degrees of freedom and
proceeds to construct an ensemble whose elements are exact copies of 
the system in all its possible phases. On the other hand, the Darwin- 
Fowler method visualizes the system in question as made up of a 
large number of independent constituent systems, e.g., N in number. 
In their notation the N constituents together form an assembly of 
systems. At any instant the state of the assembly depends on the 
states of its constituent systems and quantities representing properties 
of the assembly are averaged over all possible states of the assembly. 
For a concrete example, suppose that a perfect gas consisting of N 
free mass particles is to be described statistically. Each particle will 
be considered a constituent system of the assembly, the latter repre- 
senting the gas as a whole. The state of the gas depends on the state 
of the constituent particles and since each particle is capable of existing 
in various states characterized by different position, momentum, 
energy, etc., the corresponding states of the gas, i.e., the assembly, 
are very varied. 

It is evidently necessary to make clearer what we shall mean by a 
state. Let us suppose that the property in which we are interested is 
the energy. The state of the assembly would be ideally specified by 
stating the precise energy value of each constituent system. Since in 
practical applications the constituent systems are very numerous and 
since in general they are indistinguishable, e.g., all particles of the 
same nature, this mode of specification is impracticable. We there- 
fore fall back on the specification of the number of constituent systems 
in each allowed energy interval. Effectively the situation is like that 
envisaged in Sec. 1 of Chapter IV, where we considered the distribution
of N indistinguishable objects among $\mu$ boxes. The N objects
constitute the assembly and the $\mu$ boxes are possible or allowed energy
values. As far as classical statistics is concerned these energy boxes
may be of arbitrarily small size and continuously distributed. According
to the quantum mechanical point of view, however, the allowed
energy values may be discrete, corresponding to a lower limit to the
size of the boxes. Since we shall shortly have occasion to apply statistics
to problems treated by the quantum theory we shall take from the
start the general point of view.

1 R. H. Fowler, "Statistical Mechanics," second edition, 1936. Cambridge
University Press. The original joint papers date from 1922.

It is well to emphasize that, like the Maxwell-Boltzmann method, 
the Darwin-Fowler statistics in the form presented here operates with 
independent constituent systems. It is only for these that one can 
talk of the number of systems in each allowed energy interval, etc. 
Any attempt to generalize the method to apply to interacting con- 
stituent systems would appear to necessitate the introduction of 
Gibbsian ensembles (cf. Sec. 6 of this chapter). 

As in Chapter IV we shall assume that the individual possible
energy states or boxes have certain a priori probabilities or elementary
weights associated with them which we shall designate as
$g_0, g_1, \dots, g_j, \dots$. We generalize the earlier discussion by
refraining from setting a limit to the number of possible states. In the
discussion of quantum statistics (Chapter VIII) it will develop that the
elementary weights of the energy states of a quantum mechanical system
are always integers. On the other hand, in classical statistics the
elementary weight associated with the element $d\phi$ in phase space is
$d\phi/h^n$, where $h^n$ merely represents the unit volume in phase space
and h is Planck's action constant. The inclusion of the divisor $h^n$ is
to secure the necessary non-dimensionality in the weight. (Cf. Sec. 14,
Chapter VI.)

We are now ready to write the expression for the probability or, as
we shall now call it, the weight to be associated with that state of the
assembly in which, of the N constituent systems, $N_0$ are in the energy
state $E_0$, $N_1$ in $E_1$, ..., $N_j$ in $E_j$, etc. This follows
immediately from eq. (6) of Chapter IV with appropriate change in
terminology. If the weight in question is denoted by W, we have

$$W = N!\,\frac{g_0^{N_0}\, g_1^{N_1} \cdots g_j^{N_j} \cdots}{N_0!\, N_1! \cdots N_j! \cdots}, \tag{1}$$

where

$$\sum_j N_j = N, \tag{2}$$

the fixed total number of constituent systems in the assembly, and

$$\sum_j E_j N_j = E, \tag{3}$$

the fixed total energy of the assembly.

Now we want our statistics to give us the average number of constituent
systems in any particular energy state, say that corresponding
to $E_r$, subject to the conditions (2) and (3). Evidently to get this
number, which we shall denote by $\bar{N}_r$, we must first multiply $N_r$ by
the weight W and sum over all sets of numbers $N_0, N_1, \dots, N_j, \dots$
satisfying conditions (2) and (3). Then we must divide the result by
the sum of all the W's consistent with (2) and (3). In abbreviated
symbolical form

$$\bar{N}_r = \frac{\sum N_r W}{\sum W}. \tag{4}$$



To grasp the significance of the method it is essential to understand
the meaning of the sums in the expression for $\bar{N}_r$. Going back to (1)
we see that a value of W corresponds to each mode of distribution of
the N constituent systems over the energy states. To get $\sum W$ we
must add all these various values of W for all possible modes of
distribution consistent with (2) and (3). This gives the denominator in
(4). The numerator is obtained likewise, only before summing we
multiply each value of W by the $N_r$ which is appropriate to that value.
The remaining problem now is the mathematical evaluation of these
summations to express the value of $\bar{N}_r$ as a function of the elementary
weights and the energy $E_r$ associated with the rth state. We shall
discuss this in the next section.

At this place, however, we ought to notice the difference between 
the method of procedure here and that followed in Chapter IV in the 
treatment of statistical distributions. There we made W (or its equiv- 
alent P^) a maximum subject to conditions equivalent to (2) and (3). 
This resulted in the so-called canonical distribution (eq. (27) of Chap- 
ter IV) ; it was assumed that the number of systems associated with 
a particular energy value in this distribution would correspond to 
the observed number when the system was in a state of equilibrium. 
We then used this number to compute average values, e.g., the total
energy under various assumptions as to the possible energy values.
The Darwin-Fowler method proceeds differently; average values, e.g.,
$\bar{N}_r$, are calculated directly without the necessity for maximizing W.
We shall find to be sure, that these average values will agree with 
the corresponding distribution formulas of the Maxwell-Boltzmann
statistics as well as the averages calculated on the basis of canonical
ensembles in Gibbs's statistical mechanics. Nevertheless the variation
in the mathematical machinery used provides an interesting check on
the other ways of looking at the problem, and certain questionable
approximations, e.g., the wide use of Stirling's formula for N! in the
analysis of Chapter IV, can be avoided. In this way our confidence in
the statistical point of view will be strengthened. Moreover the 
method of Darwin and Fowler provides a very natural introduction 
to quantum statistics, used for most contemporary problems in statis- 
tical physics. 

2. EVALUATION OF AVERAGES 

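Our next problem is the purely mathematical one of evaluating
the sums in (4). First consider

$$\sum W = \sum \frac{N!}{N_0!\, N_1! \cdots N_r! \cdots}\, g_0^{N_0}\, g_1^{N_1} \cdots g_r^{N_r} \cdots. \tag{5}$$

We attack this by means of the multinomial expansion. First recall
the binomial expansion

$$(x_0 + x_1)^N = \sum \frac{N!}{N_0!\, N_1!}\, x_0^{N_0}\, x_1^{N_1}, \tag{6}$$

where $N_0 + N_1 = N$ and the sum is taken over all $N_0$ and $N_1$
satisfying this restriction. By a simple generalization

$$(x_0 + x_1 + \cdots + x_r + \cdots)^N = \sum \frac{N!}{N_0!\, N_1! \cdots N_r! \cdots}\, x_0^{N_0}\, x_1^{N_1} \cdots x_r^{N_r} \cdots, \tag{7}$$

where the summation is taken over all $N_0, N_1, \dots, N_r, \dots$, etc.,
satisfying the condition $\sum N_j = N$. The connection between (7) and (5) is
obvious, but we have still to introduce the g's and the condition
$\sum N_j E_j = E$. We do this by taking the expansion

$$(g_0 z^{E_0} + g_1 z^{E_1} + \cdots + g_r z^{E_r} + \cdots)^N = \sum \frac{N!}{N_0!\, N_1! \cdots N_r! \cdots}\, g_0^{N_0}\, g_1^{N_1} \cdots g_r^{N_r} \cdots z^{\sum N_j E_j}. \tag{8}$$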

If no further restriction is placed on the $N_j$ beyond that implied in
eq. (2) the sum on the right of (8) contains all powers of z given by
$z^{\sum N_j E_j}$. If now we wish to restrict the $N_j$ further by the
condition (3), where E is a constant, the only terms in the sum on the
right of (8) which interest us are those in which z is raised to the
power E. The sum of the coefficients of all these terms is then our
$\sum W$. It follows that $\sum W$ is equal to the coefficient of $z^E$ in the
polynomial expansion in eq. (8). If we can find some way of evaluating
this coefficient we shall have found the desired expression for $\sum W$
without employing Stirling's formula.
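
For a small assembly the statement that $\sum W$ equals the coefficient of $z^E$ can be verified by brute force. The following is a minimal sketch, assuming a finite list of non-negative integer energy levels and integer weights; the helper names are hypothetical and chosen only for illustration.

```python
from math import factorial


def compositions(total, parts):
    """Yield every tuple of `parts` non-negative integers summing to `total`."""
    if parts == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest


def weight(occupations, g):
    """W of eq. (1): N! * prod_j g_j**N_j / N_j! (an integer for integer g_j)."""
    w = factorial(sum(occupations))
    for n_j, g_j in zip(occupations, g):
        w = w * g_j**n_j // factorial(n_j)
    return w


def sum_w_direct(n_total, energy, levels, g):
    """Add W over all occupation sets obeying conditions (2) and (3)."""
    total = 0
    for occ in compositions(n_total, len(levels)):
        if sum(n * e for n, e in zip(occ, levels)) == energy:
            total += weight(occ, g)
    return total


def poly_power(coeffs, power):
    """Coefficients of (sum_k coeffs[k] z**k)**power, lowest degree first."""
    result = [1]
    for _ in range(power):
        new = [0] * (len(result) + len(coeffs) - 1)
        for i, a in enumerate(result):
            for j, b in enumerate(coeffs):
                new[i + j] += a * b
        result = new
    return result


def sum_w_from_expansion(n_total, energy, levels, g):
    """Coefficient of z**energy in (sum_j g_j z**E_j)**N, as in eq. (8)."""
    coeffs = [0] * (max(levels) + 1)
    for e, g_j in zip(levels, g):
        coeffs[e] += g_j
    expansion = poly_power(coeffs, n_total)
    return expansion[energy] if 0 <= energy < len(expansion) else 0


if __name__ == "__main__":
    levels, g = [0, 1, 2], [1, 2, 1]
    print(sum_w_direct(4, 3, levels, g), sum_w_from_expansion(4, 3, levels, g))
```

With these levels and weights the generating polynomial is $(1 + 2z + z^2)^4 = (1 + z)^8$, so both routes give the binomial coefficient of $z^3$.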

Consider the contour integral in the complex plane

$$\oint_C z^n\, dz,$$

where z is the complex variable $x + iy = r(\cos\theta + i\sin\theta)$ and the
path of integration C is a circle of radius r about the origin. Moreover
n is an integer. This integral may be evaluated by means of real
integrals by transforming to the equivalent polar coordinates. Thus
$dz = r(-\sin\theta + i\cos\theta)\,d\theta$, whence

$$\oint_C z^n\, dz = i r^{n+1} \int_0^{2\pi} (\cos\theta + i\sin\theta)^{n+1}\, d\theta.$$

By utilizing De Moivre's theorem

$$(\cos\theta + i\sin\theta)^n = \cos n\theta + i\sin n\theta, \tag{9}$$

we have

$$\oint_C z^n\, dz = i r^{n+1} \int_0^{2\pi} [\cos (n+1)\theta + i\sin (n+1)\theta]\, d\theta.$$

But

$$\int_0^{2\pi} \cos (n+1)\theta\, d\theta = \int_0^{2\pi} \sin (n+1)\theta\, d\theta = 0 \quad \text{for } n \neq -1,$$

while $\int_0^{2\pi} \cos (n+1)\theta\, d\theta = 2\pi$ for $n = -1$ and
$\int_0^{2\pi} \sin (n+1)\theta\, d\theta = 0$ for $n = -1$. Hence we reach the
general conclusion that

$$\oint_C z^n\, dz = \begin{cases} 2\pi i & \text{for } n = -1, \\ 0 & \text{for } n \neq -1. \end{cases} \tag{10}$$
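
The result (10) is easy to confirm numerically by summing the integrand at equally spaced points on the circle. This is only an illustrative sketch; the function name, the default radius, and the number of sample points are assumptions made for the example.

```python
import cmath
import math


def circle_integral_of_power(n, radius=1.0, samples=4096):
    """Approximate the integral of z**n around a circle centred on the origin.

    Uses z = r e^{i theta}, dz = i z d(theta), and an equally spaced sum
    over theta in [0, 2 pi).
    """
    d_theta = 2.0 * math.pi / samples
    total = 0j
    for k in range(samples):
        z = radius * cmath.exp(1j * k * d_theta)
        total += z**n * (1j * z) * d_theta
    return total


if __name__ == "__main__":
    for n in (-3, -1, 0, 2):
        print(n, circle_integral_of_power(n, radius=0.5))
```

Only the case $n = -1$ gives a value near $2\pi i$, and that value does not depend on the radius chosen.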

It follows that if we integrate $(\sum g_j z^{E_j})^N$ about C and divide by
$2\pi i$ we shall obtain the coefficient of $z^{-1}$ in the expansion. This of
course assumes that all the exponents in the expansion are integral. To
assure this all we need do is to choose our unit of energy so small that
effectively $\sum N_j E_j$ can always be represented to a sufficiently high
degree of accuracy as an integer. We want, however, the coefficient
of $z^E$. We must therefore integrate $(\sum g_j z^{E_j})^N / z^{E+1}$ about C to
get the coefficient of $z^E$ in the expansion $(\sum g_j z^{E_j})^N$. Let us call
the sum in the parenthesis for convenience, Z, viz.,

$$Z = \sum_j g_j z^{E_j}. \tag{11}$$

We shall follow Darwin and Fowler in referring to this as the "partition
function." 2 In this notation, then

$$\sum W = \frac{1}{2\pi i} \oint_C \frac{Z^N}{z^{E+1}}\, dz. \tag{12}$$

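We must next find the numerator in the expression for $\bar{N}_r$. On
multiplying the summand of (5) by $N_r$ and then cancelling it in
numerator and denominator, we obtain

$$\sum N_r W = \sum \frac{N!}{N_0!\, N_1! \cdots (N_r - 1)!\, N_{r+1}! \cdots}\, g_0^{N_0}\, g_1^{N_1} \cdots g_r^{N_r} \cdots, \tag{13}$$

where the summation is still to be conducted over all $N_0, N_1, \dots, N_r, \dots$
consistent with conditions (2) and (3). Let us now introduce a new
set of numbers $M_0, M_1, \dots, M_r, \dots$ defined by

$$M_0 = N_0, \quad M_1 = N_1, \quad \dots, \quad M_r = N_r - 1, \quad M_{r+1} = N_{r+1}, \quad \dots$$

Then $\sum N_r W$ may be written

$$\sum N_r W = N g_r \sum \frac{(N-1)!}{M_0!\, M_1! \cdots M_r! \cdots}\, g_0^{M_0}\, g_1^{M_1} \cdots g_r^{M_r} \cdots, \tag{14}$$

where the summation is now conducted over all $M_0, \dots, M_r, \dots$
consistent with the conditions $\sum M_j = N - 1$ and $\sum M_j E_j = E - E_r$.
The previous analysis now indicates that $\sum N_r W$ is $N g_r$ times the
coefficient of $z^{E - E_r}$ in the expansion of $(\sum g_j z^{E_j})^{N-1}$. Hence

$$\sum N_r W = \frac{N g_r}{2\pi i} \oint_C \frac{Z^{N-1}}{z^{E - E_r + 1}}\, dz = \frac{N g_r}{2\pi i} \oint_C \frac{z^{E_r}}{Z}\, \frac{Z^N}{z^{E+1}}\, dz, \tag{15}$$

and the average quantity we desire is

$$\bar{N}_r = N g_r\, \frac{\displaystyle\oint_C \frac{z^{E_r}}{Z}\, \frac{Z^N}{z^{E+1}}\, dz}{\displaystyle\oint_C \frac{Z^N}{z^{E+1}}\, dz}. \tag{16}$$

The evaluation of the integrals in the numerator and denominator of
$\bar{N}_r$ is carried out by the method of steepest descents, which will now
be described. 3

2 In Fowler, op. cit., p. 38, f(z) is used to denote the partition function.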

We consider the function

$$\phi(z) = \frac{Z(z)}{z^{E/N}}, \tag{17}$$

which can be written in the alternative form

$$\phi(z) = \frac{g_0 + g_1 z^{E_1 - E_0} + g_2 z^{E_2 - E_0} + \cdots + g_r z^{E_r - E_0} + \cdots}{z^{(N_0/N - 1)E_0 + (N_1/N)E_1 + \cdots + (N_r/N)E_r + \cdots}}.$$

It is clear that we can split $\phi(z)$ into two sets of terms, one a series of
negative powers of z and the other a series of positive powers. Thus

$$\phi(z) = \sum_j \frac{g_j}{z^{k_j}} + \sum_i g_i z^{l_i}. \tag{18}$$

We recall that the g's are all positive integers as are the exponents
$k_j$ and $l_i$. It will be noted that we can always arrange our zero energy
level so that all the $E_j$ are positive and increase monotonically with j.
Consider the behavior of $\phi(z)$ on the real axis. At $z = 0$, $\phi(z)$ is
certainly infinite owing to the presence of the first term in (18). At
$z = 1$, $\phi(z)$ is again infinite because the second term diverges for
$z = 1$ (being $\sum_i^{\infty} g_i$). Let us now differentiate $\phi(z)$. We get

$$\frac{d\phi(z)}{dz} = -\sum_j \frac{k_j g_j}{z^{k_j + 1}} + \sum_i l_i g_i z^{l_i - 1}. \tag{19}$$

Now as z goes from 0 to 1, both terms on the right side of (19) increase
monotonically. This is shown diagrammatically in Fig. 7·1 where
A is a schematic representation of the plot of $\sum l_i g_i z^{l_i - 1}$ for real
values of z between 0 and 1, while B represents $-\sum k_j g_j / z^{k_j + 1}$ in
the same interval. A schematic plot of $\phi(z)$ is also included. There is
only one value of z between 0 and 1 for which $d\phi(z)/dz = 0$, and here
$\phi(z)$ will have a minimum. Let us call this value $z = \zeta$. Going back to
(17) we have then

$$\left[\frac{dZ/dz}{z^{E/N}} - \frac{E}{N}\,\frac{Z}{z^{E/N + 1}}\right]_{z = \zeta} = 0, \quad \text{or} \quad E = \frac{N \zeta\, (dZ/dz)_{\zeta}}{Z(\zeta)}, \tag{20}$$

which acts as a defining equation for $\zeta$.

3 The method was apparently first described by P. Debye, Math. Ann., 67, 535
(1909); Munch. Sitzungsberichte, 40 (1910). See also E. T. Copson, "Theory of
Functions of a Complex Variable," p. 330. Oxford, 1935.

Now consider the circle in the complex plane with center at the
origin passing through the point $z = \zeta$. This circle is represented by
the equation $z = \zeta e^{i\alpha}$, where $\alpha$ is the angle about the origin
measured from the real axis. We seek the behavior of
$[\phi(z)]^N = [Z(z)]^N / z^E$ on this circle, where

$$[Z(z)]^N = \Big[\sum_j g_j \zeta^{E_j} e^{i\alpha E_j}\Big]^N. \tag{21}$$

If we write out the sum in the bracket in (21) it will have both a real
and imaginary part. The absolute value of this resulting complex
expression becomes

$$|Z(z)| = \Big[\sum_j \sum_k g_j g_k\, \zeta^{E_j + E_k} \cos \alpha(E_j - E_k)\Big]^{1/2}.$$

For $\alpha \neq 0$, $|Z(z)|$ is less than its value for $\alpha = 0$, unless for
all the values of $E_j$ there exists the relation

$$\alpha(E_j - E_k) = 2\pi n \tag{22}$$

for all j and k, n being any integer. Consequently if the condition
(22) is not fulfilled, the absolute value of Z(z) will be greater at
$z = \zeta$ than for any other point on the circle $z = \zeta e^{i\alpha}$. Therefore
if N is large, $[Z(z)]^N$ will have a strong maximum at this point. For this
reason $z = \zeta$ is known as a "saddle" point, since whereas it corresponds
to a minimum on the real axis, as we go away in either direction from
the real axis the function falls away very steeply, the steepness
increasing as N increases. We shall suppose that N is so large that the
value of $[Z(z)]^N$ and hence the value of $[\phi(z)]^N$ at any point on the
circle save the saddle point is negligible. If this circle is chosen for
the contour C, the value of the integral in the numerator of (16) becomes
effectively

$$N g_r \oint_C \frac{z^{E_r}}{Z}\, \frac{Z^N}{z^{E+1}}\, dz \approx N g_r\, \frac{\zeta^{E_r}}{Z(\zeta)} \oint_C \frac{Z^N}{z^{E+1}}\, dz, \tag{23}$$

since the value of $z^{E_r}/Z$ will have no appreciable effect on the integral
save for $z = \zeta$. This enables us to express the average value $\bar{N}_r$ at
once in the form

$$\bar{N}_r = \frac{N g_r\, \zeta^{E_r}}{Z(\zeta)}, \tag{24}$$

without the necessity of calculating $\oint_C Z^N / z^{E+1}\, dz$. It should be
noted that strictly speaking what we have said above about the saddle
point corresponding to much larger values of $[Z(z)]^N$ on the real axis
than for points just off the axis also applies as well to other points on
the real axis for $0 < z < 1$. However from function theory it follows
that the result of the integration is independent of the precise contour
around the origin. Hence we are at liberty to pick that contour for
which the integration is simplest. This proves to be the case for the
contour passing through the saddle point. It is usually assumed that
the descent from the saddle point is steepest (cf. Fowler, op. cit.,
p. 36) and indeed this seems qualitatively to be the case, though we
shall not endeavor to give a proof of it here. The steepness will
naturally improve the accuracy of the evaluation of $\bar{N}_r$ by means of
contour integration.

[Fig. 7·1. Schematic plots of the curves A and B and of $\phi(z)$ for real z between 0 and 1.]

We have still to examine what happens when eq. (22) is satisfied.
It will be noted that this will be true only if all the differences
$E_j - E_k$ have an integral common factor, say $l$. For then (22) will hold
for $\alpha/2\pi = 1/l, 2/l, \dots, (l - 1)/l$. Consequently instead of only one
maximum point on the contour circle there will be $l$ such points.
The result is to multiply the integral in (23) by $l$, but since numerator
and denominator are multiplied by the same quantity, the value of
$\bar{N}_r$ given in (24) is not affected.
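
How closely the steepest-descent result (24) reproduces the exact average can be examined for a small assembly. The sketch below is illustrative and rests on assumptions not in the text: a finite set of integer energy levels, a target energy per system lying between the lowest level and the mean at $\zeta = 1$ (so that the root falls in $0 < \zeta < 1$), and a simple bisection for the defining equation (20). All function names are hypothetical.

```python
from fractions import Fraction


def poly_power(coeffs, power):
    """Coefficients of (sum_k coeffs[k] z**k)**power, lowest degree first."""
    result = [1]
    for _ in range(power):
        new = [0] * (len(result) + len(coeffs) - 1)
        for i, a in enumerate(result):
            for j, b in enumerate(coeffs):
                new[i + j] += a * b
        result = new
    return result


def exact_occupations(n_total, energy, levels, g):
    """Exact N_r-bar from the coefficients used in eqs. (12) and (15)."""
    z_poly = [0] * (max(levels) + 1)
    for e, g_j in zip(levels, g):
        z_poly[e] += g_j
    lower = poly_power(z_poly, n_total - 1)      # expansion of Z**(N-1)

    def coeff(k):
        return lower[k] if 0 <= k < len(lower) else 0

    # coefficient of z**E in Z**N equals sum_r g_r * coeff(E - E_r)
    denom = sum(g_r * coeff(energy - e_r) for e_r, g_r in zip(levels, g))
    return [Fraction(n_total * g_r * coeff(energy - e_r), denom)
            for e_r, g_r in zip(levels, g)]


def mean_energy(zeta, levels, g):
    """zeta * d(log Z)/d(zeta), the right side of (20) divided by N."""
    num = sum(g_j * e * zeta**e for e, g_j in zip(levels, g))
    den = sum(g_j * zeta**e for e, g_j in zip(levels, g))
    return num / den


def saddle_point(n_total, energy, levels, g, iters=200):
    """Bisection on 0 < zeta < 1 for N * zeta * d(log Z)/d(zeta) = E."""
    lo, hi = 1e-12, 1.0
    target = energy / n_total
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mean_energy(mid, levels, g) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def saddle_occupations(n_total, energy, levels, g):
    """N_r-bar = N g_r zeta**E_r / Z(zeta), eq. (24)."""
    zeta = saddle_point(n_total, energy, levels, g)
    z_val = sum(g_j * zeta**e for e, g_j in zip(levels, g))
    return [n_total * g_r * zeta**e_r / z_val for e_r, g_r in zip(levels, g)]


if __name__ == "__main__":
    levels, g = [0, 1, 2, 3], [1, 1, 1, 1]
    exact = exact_occupations(60, 60, levels, g)
    approx = saddle_occupations(60, 60, levels, g)
    for e_r, a, b in zip(levels, exact, approx):
        print(e_r, float(a), b)
```

The exact values are kept as fractions so that their sum is exactly N; the discrepancy from (24) is expected to shrink as N grows.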

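The contour integral for $\sum W$ will be of importance in the subsequent
analysis and we shall therefore evaluate it here. Let us expand
$\log [\phi(z)]^N$ in a Taylor series about the point $z = \zeta$ and along the
circular contour. We have

$$\log [\phi(z)]^N = N\left\{\log \phi(\zeta) + (z - \zeta)\left[\frac{d \log \phi}{dz}\right]_{\zeta} + \frac{(z - \zeta)^2}{2}\left[\frac{d^2 \log \phi}{dz^2}\right]_{\zeta} + \cdots\right\}.$$

But at $z = \zeta$, $\phi(z)$ has its minimum and therefore the coefficient of
$z - \zeta$ vanishes. Moreover if $z - \zeta$ is small enough we can replace it
by $i\alpha\zeta$ to a sufficiently good approximation. Hence we have

$$\log [\phi(z)]^N = N\left\{\log \phi(\zeta) - \frac{\zeta^2 \alpha^2}{2}\, \frac{\phi''(\zeta)}{\phi(\zeta)}\right\}$$

or

$$[\phi(z)]^N = [\phi(\zeta)]^N \exp\left\{-\frac{N \zeta^2 \alpha^2}{2}\, \frac{\phi''(\zeta)}{\phi(\zeta)}\right\}.$$

Incidentally this serves to exemplify the large rate of decrease of
$[\phi(z)]^N$ for N very large as one passes from $z = \zeta$ to some nearby
value.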

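We now have, writing $dz = iz\, d\alpha$,

$$\frac{1}{2\pi i} \oint_C \frac{Z^N}{z^{E+1}}\, dz = \frac{1}{2\pi i} \oint_C [\phi(z)]^N\, \frac{dz}{z} = \frac{[\phi(\zeta)]^N}{2\pi} \int_{-\infty}^{+\infty} \exp\left\{-\frac{N \zeta^2 \alpha^2}{2}\, \frac{\phi''(\zeta)}{\phi(\zeta)}\right\} d\alpha,$$

where we have availed ourselves of the possibility of conducting the
integration about any convenient path which passes through $z = \zeta$
(where $\alpha = 0$) since the integrand is extremely small save at this
point. The limits are chosen as $+\infty$ and $-\infty$ for simplicity also.
The result of the integration is

$$\frac{1}{2\pi i} \oint_C \frac{Z^N}{z^{E+1}}\, dz = \frac{[\phi(\zeta)]^N}{\sqrt{2\pi N \zeta^2\, \phi''(\zeta)/\phi(\zeta)}} = \frac{[Z(\zeta)]^N}{\zeta^E \sqrt{2\pi N \zeta^2\, \phi''(\zeta)/\phi(\zeta)}}.$$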

Let us return to the formula (24) which gives the average number
of systems in the assembly in the energy state $E_r$. It is the
distribution law. Evidently if we sum $\bar{N}_r$ over all r we must get N.
This condition indeed is satisfied by (24) since

$$\sum_r \bar{N}_r = \frac{N}{Z(\zeta)} \sum_r g_r \zeta^{E_r} = N.$$



The presence of the parameter $\zeta$ and the partition function $Z(\zeta)$
makes the distribution law appear a little strange, especially as it was
said above that the Darwin-Fowler method leads to essentially the same
result as the canonical distribution in the Maxwell-Boltzmann
statistics. However we recall that the latter involves the parameter
$\Theta$ which we have found it necessary to interpret as kT in order to
establish connection with experimental results. If we now set

$$\zeta = e^{-1/kT}, \tag{30}$$

(24) becomes

$$\bar{N}_r = \frac{N g_r\, e^{-E_r/kT}}{\sum_j g_j\, e^{-E_j/kT}}, \tag{31}$$

in precise agreement with the canonical distribution (27) or (27') of
Chapter IV (making allowance for changes in notation). We shall
shortly present independent reasons for making the assumption (30).
To do this we shall find it necessary to discuss assemblies consisting
of more than one kind of system.

3. ASSEMBLY OF TWO KINDS OF SYSTEMS 

In place of an assembly consisting of systems all of which are of 
the same type, let us imagine an assembly in which there are two 
kinds of systems. The analysis can be readily generalized to any 
number of types, but we shall find it convenient for the sake of sim- 
plicity of notation to confine our attention to two. 

Let now the number of systems of the two kinds be $N_1$ and $N_2$
respectively, where

$$N_1 + N_2 = N. \tag{32}$$

Of the $N_1$ systems of the first type, let $N_{10}, N_{11}, N_{12}, \dots, N_{1r}, \dots$
be the numbers in the energy states $E_{10}, E_{11}, \dots, E_{1r}, \dots$
respectively. Of the $N_2$ systems of the second type, let
$N_{20}, N_{21}, N_{22}, \dots, N_{2r}, \dots$ reside in the energy states
$E_{20}, E_{21}, \dots, E_{2r}, \dots$ respectively. Note the necessary change in
notation from Sec. 2 in order to denote adequately the two types of
systems. The a priori weights attached to the energy states of the first
type of system are now $g_{10}, g_{11}, g_{12}, \dots, g_{1r}, \dots$ and those for
the second type of system $g_{20}, g_{21}, \dots, g_{2r}, \dots$. Clearly we have
the relations

$$\sum_j N_{1j} = N_1, \qquad \sum_j N_{2j} = N_2, \qquad \sum_j N_{1j} E_{1j} + \sum_j N_{2j} E_{2j} = E. \tag{33}$$

Here E is the total energy of the assembly. The eqs. (33) replace
the two equations (2) and (3).

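Our first task is to express the weight to be associated with the
state of the assembly in which $N_{10}$ systems of the first type are in
the energy state $E_{10}$, $N_{11}$ in $E_{11}$, ..., $N_{20}$ in $E_{20}$, etc. From the
analysis of Sec. 1 this weight is expressible in the form

$$W = \frac{N_1!}{N_{10}!\, N_{11}! \cdots}\, g_{10}^{N_{10}}\, g_{11}^{N_{11}} \cdots \times \frac{N_2!}{N_{20}!\, N_{21}! \cdots}\, g_{20}^{N_{20}}\, g_{21}^{N_{21}} \cdots. \tag{34}$$

The average number of systems of the first kind in the energy state
$E_{1r}$ is then $\bar{N}_{1r}$ and its value is given by

$$\bar{N}_{1r} = \frac{\sum N_{1r} W}{\sum W}. \tag{35}$$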

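The procedure for the evaluation of $\bar{N}_{1r}$ follows that employed in
Sec. 2. Thus we use the polynomial expansions:

$$\Big(\sum_j g_{1j} z^{E_{1j}}\Big)^{N_1} = \sum \frac{N_1!}{N_{10}!\, N_{11}! \cdots}\, g_{10}^{N_{10}}\, g_{11}^{N_{11}} \cdots z^{\sum_j N_{1j} E_{1j}}, \tag{36}$$

$$\Big(\sum_j g_{2j} z^{E_{2j}}\Big)^{N_2} = \sum \frac{N_2!}{N_{20}!\, N_{21}! \cdots}\, g_{20}^{N_{20}}\, g_{21}^{N_{21}} \cdots z^{\sum_j N_{2j} E_{2j}}. \tag{37}$$

The product of (36) and (37) is

$$\Big(\sum_j g_{1j} z^{E_{1j}}\Big)^{N_1} \Big(\sum_j g_{2j} z^{E_{2j}}\Big)^{N_2} = \sum \frac{N_1!}{N_{10}!\, N_{11}! \cdots}\, g_{10}^{N_{10}} \cdots \frac{N_2!}{N_{20}!\, N_{21}! \cdots}\, g_{20}^{N_{20}} \cdots z^{\sum_j N_{1j} E_{1j} + \sum_j N_{2j} E_{2j}}. \tag{38}$$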



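By precisely the same reasoning as in Sec. 2, it follows that the
denominator in the expression for $\bar{N}_{1r}$, viz., $\sum W$, is the coefficient
of $z^E$ in the expansion (38). Moreover the numerator can be handled in
similar fashion. Let us write

$$\sum_j g_{1j} z^{E_{1j}} = Z_1, \qquad \sum_j g_{2j} z^{E_{2j}} = Z_2. \tag{39}$$

The result is that

$$\bar{N}_{1r} = N_1 g_{1r}\, \frac{\displaystyle\oint_C \frac{z^{E_{1r}}}{Z_1}\, \frac{Z_1^{N_1} Z_2^{N_2}}{z^{E+1}}\, dz}{\displaystyle\oint_C \frac{Z_1^{N_1} Z_2^{N_2}}{z^{E+1}}\, dz}. \tag{40}$$

The application of the method of steepest descents with

$$[\Phi(z)]^N = \frac{Z_1^{N_1} Z_2^{N_2}}{z^E} \tag{41}$$

yields finally

$$\bar{N}_{1r} = \frac{N_1 g_{1r}\, \zeta^{E_{1r}}}{Z_1(\zeta)}, \tag{42}$$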

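where $\zeta$ is the value of z which makes $\Phi(z)$ a minimum on the real
axis. Similarly

$$\bar{N}_{2r} = \frac{N_2 g_{2r}\, \zeta^{E_{2r}}}{Z_2(\zeta)}, \tag{42'}$$

with the same $\zeta$, since there is again only one minimum for $\Phi(z)$ on
the real axis between 0 and 1. Since

$$\frac{d\Phi}{dz} = 0 \quad \text{for } z = \zeta,$$

we get the relation

$$E = N_1 \zeta\, \frac{\partial}{\partial \zeta} \log Z_1 + N_2 \zeta\, \frac{\partial}{\partial \zeta} \log Z_2. \tag{43}$$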

Each term on the right of (43) is of the same form as the expression 
for E in an assembly of systems all of the same kind, viz., eq. (20). 
This leads to the expectation that each term in (43) may be looked 
upon as the average energy of the type of system it represents. This 
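can be verified as follows. From the definition of average we have

$$\overline{E}_1 = \frac{\sum \big(\sum_j N_{1j}E_{1j}\big) W}{\sum W}. \tag{44}$$

The evaluation of the numerator is carried out by differentiating both
sides of eq. (36) with respect to $z$ and multiplying the result by $z$.
Thus

$$z \frac{d}{dz} Z_1^{N_1} = \sum N_1! \prod_j \frac{g_{1j}^{N_{1j}}}{N_{1j}!} \Big(\sum_j N_{1j}E_{1j}\Big) z^{\sum_j N_{1j}E_{1j}}. \tag{45}$$

Multiply further by $Z_2^{N_2}$ and get

$$Z_2^{N_2}\, z \frac{d}{dz} Z_1^{N_1} = \sum \Big(\sum_j N_{1j}E_{1j}\Big) W\, z^{\sum_j N_{1j}E_{1j} + \sum_j N_{2j}E_{2j}}. \tag{46}$$

The coefficient of $z^E$ in the expansion on the right is precisely the
numerator of $\overline{E}_1$. Hence we again apply the contour integration and
have

$$\overline{E}_1 = \frac{\displaystyle\oint_C \frac{dz}{z^{E+1}}\, N_1 Z_1^{N_1} Z_2^{N_2}\, z \frac{d}{dz}\log Z_1}{\displaystyle\oint_C Z_1^{N_1} Z_2^{N_2}\, \frac{dz}{z^{E+1}}}. \tag{47}$$

The method of steepest descents applied to (47) then yields

$$\overline{E}_1 = N_1 f \frac{d}{df}\log Z_1(f), \tag{48}$$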



which provides the confirmation of our surmise above that the total 
energy is the sum of the average energies of the component assemblies 
of different kinds of systems. 
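
The steepest-descent results (42) and (43) can be compared with the exact coefficient ratios for a small assembly. The following is only a sketch under stated assumptions: the level schemes (integer energies with unit weights), the numbers `N1`, `N2`, `E`, and all function names are illustrative choices, not taken from the text.

```python
"""Exact Darwin-Fowler averages versus the steepest-descent formula (42)."""


def poly_mul(a, b, deg):
    """Multiply two coefficient lists, discarding powers above z**deg."""
    out = [0] * (deg + 1)
    for i, ai in enumerate(a):
        if ai == 0:
            continue
        for j, bj in enumerate(b):
            if i + j > deg:
                break
            out[i + j] += ai * bj
    return out


def poly_pow(p, n, deg):
    """Return p(z)**n truncated at z**deg, using exact integers."""
    result = [1] + [0] * deg
    for _ in range(n):
        result = poly_mul(result, p, deg)
    return result


# Illustrative assembly (assumed numbers): levels E_aj = j * STEP_a, g_aj = 1.
N1, N2, E = 30, 30, 90
STEP1, STEP2 = 1, 2


def partition_poly(step, deg):
    """Coefficients of sum_j z**(j*step), truncated at z**deg."""
    return [1 if k % step == 0 else 0 for k in range(deg + 1)]


Z1 = partition_poly(STEP1, E)
Z2 = partition_poly(STEP2, E)
Z2_POW = poly_pow(Z2, N2, E)
DENOM = poly_mul(poly_pow(Z1, N1, E), Z2_POW, E)      # expansion (38)
NUMER = poly_mul(poly_pow(Z1, N1 - 1, E), Z2_POW, E)  # Z1**(N1-1) * Z2**N2


def exact_average(r):
    """Exact mean occupation of level r of type 1 as a ratio of coefficients."""
    energy = r * STEP1
    if energy > E:
        return 0.0
    return N1 * NUMER[E - energy] / DENOM[E]


def energy_at(f):
    """Right-hand side of (43) for these geometric partition functions."""
    x1, x2 = f ** STEP1, f ** STEP2
    return N1 * STEP1 * x1 / (1 - x1) + N2 * STEP2 * x2 / (1 - x2)


def solve_f(lo=1e-12, hi=1 - 1e-12):
    """Bisection for the f that satisfies (43); energy_at increases with f."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if energy_at(mid) < E:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


F = solve_f()


def approx_average(r):
    """Formula (42) with Z1(f) = 1 / (1 - f**STEP1)."""
    return N1 * (1 - F ** STEP1) * F ** (r * STEP1)


if __name__ == "__main__":
    for r in range(4):
        print(r, exact_average(r), approx_average(r))
```

The exact averages sum to `N1` identically, while the approximate ones differ from them by terms that shrink as the assembly grows, which is the sense in which (42) is an asymptotic formula.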

We are now ready to give some reasons for the association of $f$ with
the absolute temperature of the assembly as assumed in eq. (30), viz.,
$f = e^{-1/kT}$. Suppose we consider first an assembly consisting of $N_1$
systems of only one kind. The average number of such systems in the
$r$th energy state is then given by (42) (with $f = f_1$)

$$\overline{N}_{1r} = \frac{N_1 g_{1r} f_1^{E_{1r}}}{Z_1(f_1)}, \tag{49}$$

where $f_1$ is given by

$$E_1 = N_1 f_1 \frac{d}{df_1}\log Z_1, \tag{50}$$

the condition that $f_1$ shall provide the minimum value of $\Phi_1(z) =
Z_1(z)/z^{E_1/N_1}$ along the real axis. Consider next an assembly of $N_2$
systems of a second kind in which the corresponding average distribution
is given by

$$\overline{N}_{2r} = \frac{N_2 g_{2r} f_2^{E_{2r}}}{Z_2(f_2)}, \tag{51}$$

with $f_2$ corresponding to the minimum value of $\Phi_2(z) = Z_2(z)/z^{E_2/N_2}$
on the real axis. Note, of course, that in general it is unnecessary
that $f_1 = f_2$. However, if the two assemblies are brought together and
form a single assembly in equilibrium, the values $\overline{N}_{1r}$ and $\overline{N}_{2r}$ given
above will necessarily be special cases of

$$\overline{N}_{\alpha r} = \frac{N_\alpha g_{\alpha r} f^{E_{\alpha r}}}{Z_\alpha(f)}, \tag{52}$$

in which $\alpha = 1$ and 2 respectively for the two types of systems and
$f$ is one constant which is the same for both as long as they are in
equilibrium. Hence when two assemblies in contact are in equilibrium,
they must have associated with them the same value of $f$.
This at once suggests the possible connection of $f$ with the temperature,
since thermodynamically speaking it is the temperature which
is the same for thermodynamic systems in equilibrium. It must also
be remarked that from definition $f$ must be a positive quantity.

We should be more specific. Consider an assembly of linear simple
harmonic oscillators. We shall suppose there are two types, one with
frequency $\nu_1$ and the other with frequency $\nu_2$. Let the numbers of
the two types be $N_1$ and $N_2$ respectively. Now the study of quantum
mechanics reveals (cf. Chapter VIII) that the possible energy states
of a harmonic oscillator do not form a continuous series but are
discretely distributed. In fact we have

$$E_{1j} = (j + \tfrac12)h\nu_1; \qquad E_{2j} = (j + \tfrac12)h\nu_2. \tag{53}$$

At the same time the a priori weight factors are all unity. Thus

$$g_{1j} = g_{2j} = 1 \quad \text{for all } j. \tag{54}$$

Consequently the partition function $Z_1(f)$ becomes

$$Z_1(f) = \sum_j f^{(j + 1/2)h\nu_1} = f^{h\nu_1/2} \sum_j f^{jh\nu_1}. \tag{55}$$

Since $\sum_j f^{jh\nu_1} = 1/(1 - f^{h\nu_1})$ if $j$ runs to infinity, we have

$$Z_1(f) = \frac{f^{h\nu_1/2}}{1 - f^{h\nu_1}}$$

and similarly

$$Z_2(f) = \frac{f^{h\nu_2/2}}{1 - f^{h\nu_2}}. \tag{56}$$

The fundamental distribution formula (52) with the appropriate
substitutions then yields for the average number of oscillators of the two
types in the $r$th energy state,

$$\overline{N}_{1r} = N_1 (1 - f^{h\nu_1}) f^{rh\nu_1}, \qquad \overline{N}_{2r} = N_2 (1 - f^{h\nu_2}) f^{rh\nu_2}. \tag{57}$$

The average energy values are given by substitution into eq. (50)
with $f_1 = f$. The result is

$$\overline{E}_1 = N_1 h\nu_1 \Big[\frac12 + \frac{1}{f^{-h\nu_1} - 1}\Big], \tag{58}$$

$$\overline{E}_2 = N_2 h\nu_2 \Big[\frac12 + \frac{1}{f^{-h\nu_2} - 1}\Big]. \tag{59}$$

Now $\overline{E}_1$ includes both the kinetic and potential energies, which in
classical mechanics are each equal on the average to $kT/2$ per particle
(equipartition of energy). When $\nu_1$ and $\nu_2$ approach zero we should
expect the results of quantum theory to approach those of classical
theory. Hence in the limit

$$kT = \lim_{\nu_1 \to 0} \frac{\overline{E}_1}{N_1} = \lim_{\nu_2 \to 0} \frac{\overline{E}_2}{N_2}. \tag{60}$$

Now from (58) and (59)

$$\lim_{\nu_1 \to 0} \frac{\overline{E}_1}{N_1} = -\frac{1}{\log f}. \tag{61}$$

This follows from

$$f^{-h\nu_1} = e^{-h\nu_1 \log f} = 1 - h\nu_1 \log f + \cdots. \tag{62}$$

By comparison of (60) with (61) we get

$$f = e^{-1/kT}, \tag{63}$$

as the indicated expression for $f$. In the next section we shall give
still another demonstration of this relation. For the moment let us
note that the above discussion need not be confined to a linear
harmonic oscillator; the two- and three-dimensional cases are also easily
handled. The reader may show that for the two-dimensional oscillator,
for which $E_j = (j + 1)h\nu$ from quantum mechanics and $g_j =
j + 1$, the partition function becomes

$$Z = \frac{f^{h\nu}}{(1 - f^{h\nu})^2}, \tag{64}$$

and the average energy becomes

$$E = N h\nu \Big[1 + \frac{2}{f^{-h\nu} - 1}\Big]. \tag{65}$$

Here we have for convenience omitted the subscript denoting the type
of system in question. This leads to precisely the same connection
between $f$ and $T$ that we found in the one-dimensional case. For
now

$$\lim_{\nu \to 0} \frac{E}{N} = -\frac{2}{\log f},$$

but since the oscillator has two degrees of freedom, in the limit of
vanishing frequency its energy becomes $2kT$, and (63) is again
obtained.

For the three-dimensional oscillator, quantum mechanical reasoning 4
indicates $E_j = (j + 3/2)h\nu$ and $g_j = \tfrac12 (j + 1)(j + 2)$. The
result for the average energy is

$$E = N h\nu \Big[\frac32 + \frac{3}{f^{-h\nu} - 1}\Big], \tag{66}$$

and again (63) is found to be satisfied.
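
The closed forms of the oscillator partition functions and the classical limit used to identify $f$ can be checked numerically. This is a minimal sketch, not part of the original treatment: the function names are hypothetical, `q` stands for $f^{h\nu}$, and the general degeneracy formula is simply a compact way of writing the three weights quoted above.

```python
import math


def degeneracy(j, dim):
    """Weight g_j of level j: 1, j + 1, (j + 1)(j + 2)/2 for dim = 1, 2, 3."""
    return math.comb(j + dim - 1, dim - 1)


def partition_series(q, dim, terms=400):
    """Direct sum of g_j * f**E_j with E_j = (j + dim/2) h nu and q = f**(h nu)."""
    return sum(degeneracy(j, dim) * q ** (j + dim / 2) for j in range(terms))


def partition_closed(q, dim):
    """Closed form q**(dim/2) / (1 - q)**dim, cf. (56) and (64)."""
    return q ** (dim / 2) / (1 - q) ** dim


def mean_energy(h_nu, kT, dim):
    """Average energy per oscillator, f d(log Z)/df evaluated at f = exp(-1/kT)."""
    return dim * h_nu * (0.5 + 1.0 / math.expm1(h_nu / kT))


if __name__ == "__main__":
    for dim in (1, 2, 3):
        print(dim, partition_series(0.5, dim), partition_closed(0.5, dim))
        # As h nu -> 0 the energy per oscillator tends to dim * kT.
        print(dim, mean_energy(1e-4, 1.0, dim))
```

With `kT = 1` the last line of each pass prints a value very close to `dim`, which is the equipartition result invoked in the text for one, two, and three degrees of vibrational freedom.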

The reader will find some interest in comparing the average energy
for each type of harmonic oscillator obtained in the present discussion
with the canonical distribution formulas in Chapter IV. Thus expressing
$E$ in (65) in terms of $T$, the two-dimensional oscillator average
energy is

$$E = N \Big[h\nu + \frac{2h\nu}{e^{h\nu/kT} - 1}\Big], \tag{67}$$

which is precisely the form of eq. (47) in Chapter IV with =J*T. 
It must be realized, of course, that E in (47) corresponds to our E/N 
in (67). The physical significance of average energy expressions like 
(67) will appear more clearly when we study the quantum statistics of 
radiation in the next chapter. 

4. STATISTICS OF AN IDEAL GAS 

Consider an assembly of $N$ particles each of mass $m$ and with
negligible mutual interaction forces. The particles are confined in a
vessel of volume $\tau$. This can be represented symbolically by assuming
that the potential energy for any particle is $V(x, y, z) = 0$ everywhere
inside the vessel but rises abruptly to infinity at the walls
and maintains this value everywhere outside, implying that no particle
is able to escape from the vessel. In applying the Darwin-Fowler
method to such an assembly the essential matter is the evaluation
of the partition function $Z$ (eq. 11). The energy values $E_j$ no
longer form a discrete set but are continuously distributed. We have
indeed

$$E = \frac{1}{2m}(p_x^2 + p_y^2 + p_z^2) + V(x, y, z). \tag{68}$$

4 Cf. for example, L. Pauling and E. B. Wilson, "Introduction to Quantum
Mechanics," McGraw-Hill, New York, 1935, p. 100.

If we suppose that the phase space of the assembly is divided into
cells with phase volume $\Delta\gamma_j$, where

$$\Delta\gamma_j = \Delta p_{xj}\, \Delta p_{yj}\, \Delta p_{zj}\, \Delta x_j\, \Delta y_j\, \Delta z_j, \tag{69}$$

and assign one set of momentum and coordinate values to each cell,
we can write

$$E_j = \frac{1}{2m}(p_{xj}^2 + p_{yj}^2 + p_{zj}^2) + V(x_j, y_j, z_j). \tag{70}$$

From Sec. 1 the elementary weights are given by

$$g_j = \frac{\Delta\gamma_j}{h^3}. \tag{71}$$

The partition function for this cell distribution becomes

$$Z = \sum_j \frac{\Delta\gamma_j}{h^3}\, z^{(p_{xj}^2 + p_{yj}^2 + p_{zj}^2)/2m + V(x_j, y_j, z_j)}. \tag{72}$$

As the cells decrease in size and increase in number the sum above
goes into an integral over the phase space, viz.,

$$Z = \frac{1}{h^3} \int \cdots \int z^{(p_x^2 + p_y^2 + p_z^2)/2m + V(x, y, z)}\, dp_x\, dp_y\, dp_z\, dx\, dy\, dz, \tag{73}$$

where, of course, it is essential not to confuse the $z$ which is the basic
independent variable in $Z$ with the space coordinate $z$. The volume
integration is to be conducted over the whole physical volume $\tau$ and
the momentum integration from $-\infty$ to $+\infty$ for each component.
This expression can be materially simplified by noting that

$$\iiint z^{V(x, y, z)}\, dx\, dy\, dz = \tau, \tag{74}$$

since $V = 0$ everywhere inside the vessel. Therefore

$$Z = \frac{\tau}{h^3} \iiint z^{(p_x^2 + p_y^2 + p_z^2)/2m}\, dp_x\, dp_y\, dp_z. \tag{75}$$

If we utilize the fact that

$$a^{bx^2} = e^{-bx^2 \log 1/a},$$

we can express the partition function in the form

$$Z = \frac{\tau}{h^3} \iiint_{-\infty}^{+\infty} e^{-[(p_x^2 + p_y^2 + p_z^2)/2m] \log 1/z}\, dp_x\, dp_y\, dp_z, \tag{76}$$

which at once yields

$$Z = \frac{\tau}{h^3} \Big(\frac{2\pi m}{\log 1/z}\Big)^{3/2}. \tag{77}$$



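In the interpretation of this expression it must be recalled that $z < 1$.
We can now use the fundamental formula (24) to obtain the average
number of particles in the phase element $dp_x\, dp_y\, dp_z\, dx\, dy\, dz$. This
number is $\Delta N$, where

$$\Delta N = \frac{N}{Z}\, \frac{dp_x\, dp_y\, dp_z\, dx\, dy\, dz}{h^3}\, f^{(p_x^2 + p_y^2 + p_z^2)/2m}. \tag{78}$$

Therefore the probability of finding a particle within this phase
element is

$$P = \frac{\Delta N}{N} = \frac{1}{\tau} \Big(\frac{\log 1/f}{2\pi m}\Big)^{3/2} f^{(p_x^2 + p_y^2 + p_z^2)/2m}\, dp_x\, dp_y\, dp_z\, dx\, dy\, dz. \tag{79}$$

The probability that a particle shall have its momentum components
included in the interval $p_x$, $p_x + dp_x$, with no restriction on its
position in the vessel, etc., is then

$$P' = \iiint P\, dx\, dy\, dz = \Big(\frac{\log 1/f}{2\pi m}\Big)^{3/2} f^{(p_x^2 + p_y^2 + p_z^2)/2m}\, dp_x\, dp_y\, dp_z. \tag{80}$$

Placing $p_x = mv_x$, etc. and $f = e^{-1/kT}$ from (63), we obtain $P'$ in the
form of the Maxwellian velocity distribution already discussed in
Chapter V (cf. eq. 51 of that chapter) and again by the Gibbs method
in Sec. 10 of Chapter VI, viz.,

$$P' = \Big(\frac{m}{2\pi kT}\Big)^{3/2} e^{-m(v_x^2 + v_y^2 + v_z^2)/2kT}\, dv_x\, dv_y\, dv_z. \tag{81}$$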

Again we see that the connection between $f$ and $T$ in the form
$f = e^{-1/kT}$ is definitely indicated. Still another way of looking at the
same matter is provided by the classical equipartition principle, which
in the present instance takes the form

$$E = \tfrac32 NkT. \tag{82}$$

Let us calculate $E$ on the Darwin-Fowler method and see what form
$f$ must have to satisfy the equipartition principle. We have

$$E = N f \frac{d}{df}\log Z(f). \tag{83}$$

If the reader substitutes for $Z$ the value given in (77) and performs
the indicated operations he will come out with

$$E = \frac32 N \Big(\log\frac1f\Big)^{-1}. \tag{84}$$

Equating this to $\tfrac32 NkT$ gives again

$$f = e^{-1/kT}, \tag{85}$$

which completes our validation of the connection between the Darwin- 
Fowler parameter f and the absolute temperature. 
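
The step from the partition function to the equipartition energy can be verified by differentiating $\log Z$ numerically. The sketch below assumes units with $k = 1$ and takes the constant prefactor of $Z$ as `volume / h**3`; that prefactor is an assumption of the sketch and drops out of the derivative in any case. The function names are hypothetical.

```python
import math


def log_Z(log_f, m=1.0, volume=1.0, h=1.0):
    """log Z for Z = (volume / h**3) * (2 pi m / log(1/f))**1.5 (prefactor assumed)."""
    return math.log(volume / h ** 3) + 1.5 * math.log(2 * math.pi * m / (-log_f))


def energy_per_particle(kT, step=1e-6):
    """E/N = f d(log Z)/df = d(log Z)/d(log f), by a central difference."""
    u = -1.0 / kT  # u = log f for f = exp(-1/kT)
    return (log_Z(u + step) - log_Z(u - step)) / (2 * step)


if __name__ == "__main__":
    for kT in (0.5, 1.0, 2.0):
        # The ratio to kT should be close to 3/2 for every temperature.
        print(kT, energy_per_particle(kT) / kT)
```

Because only the dependence of $Z$ on $\log 1/f$ enters, changing `m`, `volume`, or `h` in `log_Z` leaves the printed ratio unchanged.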

The reader will probably have noted the close connection between 
the partition function (77) and the so-called distribution function 
obtained in Sec. 4, Chapter IV (eq. 83) for an ideal gas on the Maxwell- 
Boltzmann statistics. By allowing for the difference in notation for 
the physical volume the two expressions are in complete agreement. 
It is true that the general defining expression for Z in Maxwell- 
Boltzmann statistics in eq. (53), Chapter IV, appears not to agree 
with eq. (11) of the present chapter. The reason is to be found in 
the different definitions of the gj in the two methods. In the Maxwell- 
Boltzmann statistics the gj are genuine mathematical a priori proba- 
bilities which are proper fractions, while in the Darwin-Fowler method 
the gj are elementary weights which are integers. This explains the 
appearance of the factor n in the earlier definition of Z. Actually 
there is complete agreement between the two points of view and 
further evidence of this will appear as we proceed. 

5. THE CONCEPT OF ENTROPY 

We must now see how the idea of entropy fits into the Darwin- 
Fowler statistical method. Somewhere in the theory we must find a 
quantity which for an isolated system tends to increase. This quan- 
tity must finally enter into equations which are formally equivalent 
to the relations of thermodynamics. 

The reader will recall from Chapter VI (Sec. 13) that Gibbs
introduced $k\eta = k(E - \psi)/\Theta$ as the statistical mechanical analogue of
entropy. This quantity was found to possess the necessary qualifications
stated in the preceding paragraph. It has indeed the interesting
property that it can be defined for non-canonical as well as canonical
distributions and hence applies to both equilibrium and non-equilibrium
states. In a certain sense therefore it overpredicts experience, since in
thermodynamics the entropy is defined for equilibrium states only.
Since the Darwin-Fowler method confines itself to the calculation of 
averages over actual assemblies of systems, where these averages are 
then interpreted as the measured values of properties of the assemblies 
in states of equilibrium, we should expect that the Darwin-Fowler 
definition of entropy will apply only to equilibrium configurations. 
This has the possible advantage that it does not transcend experience 
like the Gibbs theory. On the other hand it has the disadvantage 
that no matter what precise definition is chosen, we cannot hope to 
show that the value of the entropy increases with the passage of time. 
All we can hope to do is to show that there is an increase in entropy 
when two assemblies in equilibrium are combined to form a new 
assembly. 

In view of the close fundamental connection between the Darwin-
Fowler method and the classical Maxwell-Boltzmann statistics we
expect that the definition of entropy in the former will follow the
example set by the latter. It is natural to replace the "statistical
probability" $w$ (eq. 49, Chapter IV) by the expression $\sum W$, the sum
of weights entering into the denominator of eq. (4) of the present
chapter. As before we shall divide $\sum W$ by $N!$ and finally define

$$S = k \log \sum W - k \log N!, \tag{86}$$

where $k$ is, as usual, the Boltzmann gas constant. The analytical
expression for $\sum W$ in terms of a contour integral has already been
given in eq. (12), Sec. 2, and the value in terms of the parameters
of the assembly has been calculated in eq. (28), which we set down
here again for reference

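
$$\sum W = \frac{[\Phi(f)]^N}{f\, [2\pi N\, \Phi''(f)/\Phi(f)]^{1/2}},$$

where

$$\Phi(f) = \frac{Z(f)}{f^{E/N}},$$

and $\Phi''(f)$ denotes the second derivative with respect to $f$. If we
take the logarithm of $\sum W$, we obtain

$$\log \sum W = N \log Z(f) - E \log f + \tfrac12 \log \frac{\Phi(f)}{2\pi N f^2\, \Phi''(f)}. \tag{87}$$

Let us look into the magnitude of the last term. Substitution of the
expression for $\Phi(f)$ given above yields for this term

$$\tfrac12 \log Z(f) - \frac{E}{2N}\log f - \tfrac12 \log 2\pi N - \log f + \frac{E}{2N}\log f - \tfrac12 \log\Big[Z'' + \frac{E}{N}\Big(\frac{E}{N} + 1\Big) Z f^{-2} - \frac{2E}{N} Z' f^{-1}\Big]. \tag{88}$$

The first, second, fourth, and fifth terms in (88) are very small
compared with the first two terms in (87). Now we write further

$$Z'(f) = \sum_j g_j E_j f^{E_j - 1}; \qquad Z''(f) = \sum_j g_j E_j (E_j - 1) f^{E_j - 2},$$

whence the bracket term in (88) becomes

$$-\tfrac12 \log f^{-2}\Big[\sum_j g_j E_j^2 f^{E_j} + \frac{E}{N}\Big(1 + \frac{E}{N}\Big) Z - \Big(1 + \frac{2E}{N}\Big) \sum_j g_j E_j f^{E_j}\Big].$$

We can further write

$$\sum_j g_j E_j^2 f^{E_j} = \overline{E^2}\, Z,$$

where $\overline{E^2}$ is a kind of mean-square energy value, while

$$\sum_j g_j E_j f^{E_j} = \overline{E}\, Z,$$

where $\overline{E} = E/N$. Consequently the bracket term becomes

$$-\tfrac12 \log\big[f^{-2} Z\, (\overline{E^2} - \overline{E}^2)\big].$$

Now certainly $\overline{E^2} - \overline{E}^2$ is of the order of magnitude of $\overline{E}^2$ or smaller.
In neglecting the bracket we shall then be neglecting terms of the
order of $\log \overline{E}$ and $\log Z$, which are small compared with $N \log Z$.
The upshot is that, if we disregard the constant term $\tfrac12 \log 2\pi N$
(which since it is a constant will play no role in entropy changes and
which in any case is small compared with the terms retained as long
as $N$ is large), we can get a very good approximation to $\log \sum W$ by
retaining only the first two terms in (87) and writing therefore

$$\log \sum W = N \log Z(f) - E \log f. \tag{89}$$

This should be compared with (63) in Chapter IV.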

Let us suppose that we have two assemblies composed of different
types of systems and imagine that the two assemblies are joined to
form a single one. From Sec. 3 we are justified in replacing the
partition function $Z$ by $Z_1^{N_1/N} Z_2^{N_2/N}$, where $Z_1$ and $Z_2$ are the partition
functions for the individual assemblies before the combination. Hence
$\log \sum W$ now becomes

$$\log \sum W = N_1 \log Z_1(f) + N_2 \log Z_2(f) - E \log f. \tag{90}$$

Here, of course, we also have

$$E = E_1 + E_2,$$

where $E_1$ and $E_2$ represent the energies of the individual, isolated
assemblies.

Before combination the entropies of the individual assemblies are
respectively

$$S_1 = k[N_1 \log Z_1(f_1) - E_1 \log f_1 - \log N_1!], \qquad S_2 = k[N_2 \log Z_2(f_2) - E_2 \log f_2 - \log N_2!], \tag{91}$$

where we have used different values of $f$ since the temperatures of the
assemblies need not be the same. Let us find the condition laid on $f_1$
in order that $S_1$ shall be a minimum. This is clearly

$$\frac{\partial S_1}{\partial f_1} = 0, \tag{92}$$

which yields the energy value

$$E_1 = N_1 f_1 \frac{d}{df_1}\log Z_1(f_1), \tag{93}$$

which is just the energy value for equilibrium, as already determined
previously in eqs. (20) and (48). Hence $S_1$ has a stationary value for
the value of $f$ for which the first assembly is in equilibrium with total
energy $E_1$. That this stationary value is a minimum is clear from the
form of $S_1$. In fact (91) allows us to write

$$S_1 = k \log \frac{[Z_1(f_1)]^{N_1}}{f_1^{E_1}\, N_1!} = k \log \frac{[\Phi_1(f_1)]^{N_1}}{N_1!}. \tag{94}$$

But we know from the method of steepest descents in Sec. 2 that $\Phi(z)$
has a minimum on the real axis for the value $z = f_1$. This assures the
minimum property of $S_1$. In similar fashion we can show that $S_2$
has a minimum for the value of $f$ for which the second assembly is in
equilibrium with total energy $E_2$.

Now let the two assemblies be joined to form a single assembly and
let $f$ correspond to the temperature of the combined assembly. We
then have for the entropy

$$S = k[N_1 \log Z_1(f) + N_2 \log Z_2(f) - (E_1 + E_2) \log f - \log N_1! - \log N_2!]. \tag{95}$$

This can be written as the sum of $S_1(f)$ and $S_2(f)$, i.e., the sum of the
entropies of the two isolated assemblies at the same temperature,
where, of course

$$S_1(f) = k[N_1 \log Z_1(f) - E_1 \log f - \log N_1!], \qquad S_2(f) = k[N_2 \log Z_2(f) - E_2 \log f - \log N_2!]. \tag{96}$$

Now we have just shown that $S_1(f)$ has a minimum for $f = f_1$
corresponding to (93) and $S_2(f)$ a minimum for $f = f_2$. Therefore unless
$f_1 = f_2 = f$ it follows that

$$S_1(f_1) < S_1(f), \qquad S_2(f_2) < S_2(f). \tag{97}$$

Consequently except for this special case

$$S(f) = S_1(f) + S_2(f) > S_1(f_1) + S_2(f_2). \tag{98}$$

This shows that the entropy after the combination of the two assem- 
blies is greater than the sum of the individual entropies previously. Of 
course, if the two assemblies are already in equilibrium at the same 
temperature, the total entropy is unchanged, in agreement with the 
usual thermodynamical result. 
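
The inequality (98) can be illustrated numerically with two assemblies of linear oscillators. The following sketch rests on assumed values: the numbers of oscillators, the quanta `HNU1` and `HNU2`, and the initial temperatures are arbitrary, units are chosen so that $k = 1$, and the constant terms $\log N_1!$ and $\log N_2!$ are omitted because they cancel in the comparison.

```python
import math


def log_Z(f, h_nu):
    """log of the linear-oscillator partition function f**(h nu/2) / (1 - f**(h nu))."""
    return 0.5 * h_nu * math.log(f) - math.log(1 - f ** h_nu)


def energy(f, n, h_nu):
    """E = N f d(log Z)/df for N oscillators with quantum h_nu."""
    x = f ** h_nu
    return n * h_nu * (0.5 + x / (1 - x))


def entropy_over_k(f, n, h_nu, e_total):
    """N log Z(f) - E log f at fixed energy e_total (the -log N! term is omitted)."""
    return n * log_Z(f, h_nu) - e_total * math.log(f)


def common_f(e_total, parts):
    """Bisection for the f at which the joined assembly holds energy e_total."""
    lo, hi = 1e-12, 1 - 1e-12
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if sum(energy(mid, n, h_nu) for n, h_nu in parts) < e_total:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Assumed illustrative assemblies at different initial temperatures.
N1, HNU1, KT1 = 100, 1.0, 1.0
N2, HNU2, KT2 = 100, 2.0, 3.0
F1, F2 = math.exp(-1 / KT1), math.exp(-1 / KT2)
E1, E2 = energy(F1, N1, HNU1), energy(F2, N2, HNU2)
F = common_f(E1 + E2, [(N1, HNU1), (N2, HNU2)])

S_BEFORE = entropy_over_k(F1, N1, HNU1, E1) + entropy_over_k(F2, N2, HNU2, E2)
S_AFTER = entropy_over_k(F, N1, HNU1, E1) + entropy_over_k(F, N2, HNU2, E2)

if __name__ == "__main__":
    print("common temperature:", -1 / math.log(F))
    print("entropy/k before and after:", S_BEFORE, S_AFTER)
```

The common value of $f$ falls between the two initial values, and the entropy after joining exceeds the sum before, as (98) requires whenever the initial temperatures differ.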

The final step is to show that the Darwin-Fowler entropy $S =
k \log \sum W - k \log N!$ satisfies the fundamental thermodynamical
relation (the first law)

$$dE + \Delta A = T\, dS, \tag{99}$$

where we here temporarily denote the element of work done by the
assembly by $\Delta A$ to avoid confusion in notation.

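For simplicity let us assume that the assembly consists of systems
of only one kind. The possible energy states $E_i$ will be functions of $n$
external parameters which we shall denote by $\xi_1, \ldots, \xi_n$, e.g., the
volume of the region occupied by the assembly. If no external influence
is brought to bear on the assembly, the $\xi$'s remain unaltered and with
them the $E_i$ values. Since the partition function $Z$ depends on the
energy states, it likewise is a function of the $\xi$'s. Thus the total energy
is given by

$$E = N f \frac{\partial}{\partial f}\log Z(f, \xi_1, \ldots, \xi_n), \tag{100}$$

and the change in energy associated with change in the temperature
parameter $f$ and changes $d\xi_1, \ldots, d\xi_n$ in the external parameters can
be conveniently written in the form

$$dE = N \frac{\partial}{\partial f}\Big(f \frac{\partial \log Z}{\partial f}\Big) df + N \sum_{j=1}^{n} f \frac{\partial^2 \log Z}{\partial f\, \partial \xi_j}\, d\xi_j. \tag{101}$$

We calculate $\Delta A$ by noting that $-\partial E_i/\partial \xi_j$ is the force associated with the
change in $E_i$ due to unit change in $\xi_j$. Thus $-(\partial E_i/\partial \xi_j)\, d\xi_j$ is the work done
by a system of the assembly in the state $E_i$ when $\xi_j$ changes by $d\xi_j$.
The total work done by such a system when the changes $d\xi_1, d\xi_2, \ldots, d\xi_n$
take place is then

$$-\sum_{j=1}^{n} \frac{\partial E_i}{\partial \xi_j}\, d\xi_j. \tag{102}$$

Now on the average there are

$$\frac{N g_i f^{E_i}}{Z}$$

systems in the assembly in the energy state $E_i$. Hence the total
contribution of these to the work done is

$$-\frac{N g_i f^{E_i}}{Z} \sum_{j=1}^{n} \frac{\partial E_i}{\partial \xi_j}\, d\xi_j, \tag{103}$$

and the work done by the systems in all states becomes

$$-\frac{N}{Z} \sum_i \sum_{j=1}^{n} g_i f^{E_i} \frac{\partial E_i}{\partial \xi_j}\, d\xi_j = \Delta A. \tag{104}$$

Now from the definition of the partition function it follows that

$$\frac{\partial}{\partial \xi_j}\log Z(f, \xi_1, \ldots, \xi_n) = \frac{\log f}{Z} \sum_i g_i f^{E_i} \frac{\partial E_i}{\partial \xi_j}. \tag{105}$$

Hence by comparison with (104) we obtain

$$\Delta A = -\frac{N}{\log f} \sum_{j=1}^{n} \frac{\partial \log Z}{\partial \xi_j}\, d\xi_j. \tag{106}$$

Therefore

$$dE + \Delta A = N \Big[\frac{\partial}{\partial f}\Big(f \frac{\partial \log Z}{\partial f}\Big) df + \sum_{j=1}^{n} \Big(f \frac{\partial^2 \log Z}{\partial f\, \partial \xi_j} - \frac{1}{\log f} \frac{\partial \log Z}{\partial \xi_j}\Big) d\xi_j\Big]. \tag{107}$$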

dy J 

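Now let us go back to the Darwin-Fowler entropy definition (86) and
write

$$S = kN \log Z - kN f \log f\, \frac{\partial \log Z}{\partial f} - k \log N!. \tag{108}$$

The change in $S$ corresponding to changes $df$ and $d\xi_1, \ldots, d\xi_n$ then
becomes

$$dS = kN \frac{\partial \log Z}{\partial f}\, df + kN \sum_{j=1}^{n} \frac{\partial \log Z}{\partial \xi_j}\, d\xi_j - kN \log f\, \frac{\partial}{\partial f}\Big(f \frac{\partial \log Z}{\partial f}\Big) df - kN \frac{\partial \log Z}{\partial f}\, df - kN f \log f \sum_{j=1}^{n} \frac{\partial^2 \log Z}{\partial f\, \partial \xi_j}\, d\xi_j. \tag{109}$$

Reduction of (109) and comparison with (107) yields finally

$$-\frac{dS}{k \log f} = \Delta A + dE. \tag{110}$$

But since $\log f = -1/kT$, this is equivalent to (99), further validating
the Darwin-Fowler definition of entropy.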

A simple illustration of the preceding considerations is provided by
the ideal gas of Sec. 4. The substitution of the partition function (77)
into the entropy expression (108) immediately leads to

$$S = kN \log\Big[\frac{\tau}{N h^3} (2\pi m kT)^{3/2}\Big] + \frac{E}{T} + kN, \tag{111}$$

which is identical with (88) of Chapter IV. If we form the total
differential, treating $\tau$, the volume, as the sole external parameter, we
obtain

$$T\, dS = \frac{kNT}{\tau}\, d\tau + \tfrac32 kN\, dT - \frac{E}{T}\, dT + dE. \tag{112}$$

But we recall from (82) that $E$, which represents the average total
energy, for an ideal gas is $\tfrac32 NkT$, and from this

$$T\, dS = \frac{kNT}{\tau}\, d\tau + dE. \tag{113}$$

From the equation of state of the ideal gas, however, $p\tau = NkT$;
therefore $kNT/\tau = p$ and $p\, d\tau = \Delta A$, and eq. (113) becomes
equivalent to (99).
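
The differential relation for the ideal gas can be checked by finite differences on the entropy. This sketch assumes units with $k = m = h = 1$ and writes the entropy in the Sackur-Tetrode form with $E = \tfrac32 NkT$ inserted; the constant inside the logarithm is an assumption of the sketch and does not affect either derivative. All names are hypothetical.

```python
import math

K = M = H = 1.0  # units chosen for this sketch (assumption)


def entropy(T, tau, n):
    """Ideal-gas entropy kN log[tau (2 pi m k T)**1.5 / (N h**3)] + E/T + kN."""
    e = 1.5 * n * K * T
    inside = tau / (n * H ** 3) * (2 * math.pi * M * K * T) ** 1.5
    return K * n * math.log(inside) + e / T + K * n


def partial(func, x, step=1e-6):
    """Central-difference derivative of a one-argument function."""
    return (func(x + step) - func(x - step)) / (2 * step)


def check(T=2.0, tau=5.0, n=10.0):
    """Return (T dS/dtau, p, T dS/dT, dE/dT) for comparison."""
    ds_dtau = partial(lambda v: entropy(T, v, n), tau)
    ds_dT = partial(lambda t: entropy(t, tau, n), T)
    pressure = n * K * T / tau
    heat_capacity = 1.5 * n * K
    return T * ds_dtau, pressure, T * ds_dT, heat_capacity


if __name__ == "__main__":
    print(check())
```

The first pair of returned values agrees because $T\, \partial S/\partial \tau = p$, and the second because $T\, \partial S/\partial T = dE/dT$ at constant volume, which together express $T\, dS = p\, d\tau + dE$.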

6. THE PARTITION FUNCTION AND GIBBS'S PHASE INTEGRAL 

The reader will have observed the close connection between the
analysis in the preceding section and that in Sec. 3, Chapter IV. There
we were still using $w$ for the statistical probability but the analogy
between this and $\sum W$ is very close. The question arises as to the
connection between the Darwin-Fowler partition function and entropy
and Gibbs's statistical mechanics. In the first place we note the
interesting mathematical similarity between the partition function

$$Z = \sum_j g_j e^{-E_j/kT}$$

and the Gibbs phase integral

$$I = \int \cdots \int e^{-H/\Theta}\, dq_1 \cdots dp_n.$$

In fact for a system composed of free particles, in evaluating $Z$ we
actually effectively computed $I$. It is well to note, however, that the
two differ in their logical basis. In the Gibbs phase integral $H$ is the
Hamiltonian function for a dynamical system described by an ensemble
of elements distributed throughout phase space. The various values of
$H$ for different parts of phase space are not different values of the
energy of a particular component system; they refer to the fictitious
elements of the ensemble. In the partition function, on the other hand,
the $E_j$ are the possible energy values of a component system forming a
constituent of a whole assembly of systems. We recall that it is the
assembly of component systems in the Darwin-Fowler method which
corresponds to the actual system described by the ensemble in the
Gibbs method. Hence we expect that the formal analogy between $I$
and $Z$ will be of validity only for systems which can be described by a
Gibbsian ensemble constructed for a single constituent of the system
without bothering to construct an ensemble for the whole system. We
found (cf. Sec. 10, Chapter VI) this to be possible for a system of free
particles. For a system of interacting particles, however, this procedure
could not be carried out and here the analogy between $I$ and $Z$ could
not be logically maintained. This naturally will not prevent the
ment of the summation in Z by an equivalent integration whenever this 
proves to be mathematically more convenient. Moreover it is quite 
conceivable that the simple Darwin-Fowler method may be general- 
ized to deal with systems in which the constituents interact with each 
other and for which in the partition function the Ej will refer to possible 
energy values of the whole assembly. As a matter of fact Darwin and 
Fowler do use such partition functions occasionally. Their theoretical 
justification will presumably rest ultimately on an appeal to the 
Gibbsian ensemble concept. 

In the next chapter we shall consider in detail the modification 
introduced into statistics by the advent of quantum mechanics. There 
we shall find it convenient to use as a framework the method of 
Darwin-Fowler or its equivalent, the classical method of Chapter IV. 
This must not be interpreted to mean, however, that it is impossible 
to develop quantum statistics by a generalization of the method of 
Gibbs. The close connection between the partition function and 
the Gibbs phase integral which we have just stressed suggests 
the possibility of translating the Gibbs statistical mechanics into the 
quantum mechanical terminology. This will involve the replacement 
of the concept of the motion of ensembles in phase space by that of the 
existence of definite and often discrete quantum states which alone 
specify the possible motions of physical systems. Integration of 
quantities over phase space will be replaced by summations over the 
discrete quantum states. In certain cases, e.g., at high temperatures, 
the quantum states may be crowded together so closely that for prac- 
tical purposes the summations may be replaced by integrals. In these 
limiting cases one therefore expects that quantum statistics will lead to 
the same result as classical statistics. 5 
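
The passage from a sum over discrete states to an integral at high temperature can be seen for the linear oscillator. This is a minimal sketch with hypothetical function names, in units where $h\nu = 1$ by default; the classical value $kT/h\nu$ is the phase integral of $e^{-H/kT}$ for one oscillator divided by $h$, a normalization assumed here for the comparison.

```python
import math


def z_quantum(kT, h_nu=1.0, terms=20000):
    """Direct sum over the discrete levels E_j = (j + 1/2) h nu."""
    return sum(math.exp(-(j + 0.5) * h_nu / kT) for j in range(terms))


def z_classical(kT, h_nu=1.0):
    """Phase integral of exp(-H/kT) for one oscillator, divided by h."""
    return kT / h_nu


if __name__ == "__main__":
    for kT in (0.2, 1.0, 10.0, 100.0):
        # The ratio approaches 1 as kT becomes large compared with h nu.
        print(kT, z_quantum(kT) / z_classical(kT))
```

At `kT = 100` the two values agree to a few parts in a million, while at `kT = 0.2` the discreteness of the levels changes the result by more than a factor of two.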

5 For further discussion of this point, cf. Mayer and Mayer, "Statistical
Mechanics," pp. 123 ff, 218 ff, 240. (John Wiley & Sons, 1940.)

PROBLEMS

1. Consider an assembly of $N$ systems in which each system is a rotator about a
fixed axis. Let there be $n$ types of rotators with frequencies $\nu_1, \nu_2, \ldots, \nu_n$ and the
numbers of each type $N_1, \ldots, N_n$. Use the energy values given by the classical Bohr
quantum theory. Find $\overline{N}_{ir}$ and $E$.

2. Solve Problem 1 for the case in which the rotators have two degrees of freedom.
For the energy values, cf. Lindsay and Margenau, "Foundations of Physics," p. 433.

3. Find the partition function and the average energy for an assembly of $N$
two-dimensional harmonic oscillators.

4. Solve Problem 3 for the case of $N$ three-dimensional oscillators.

5. Find the expression for the entropy of an assembly of $N$ simple harmonic
oscillators of the same frequency in terms of the total energy and the temperature.
Solve the same problem for an assembly composed of $n$ types of oscillators with
frequencies $\nu_1, \nu_2, \ldots, \nu_n$.

6. Find the expression for the specific heat at constant volume for an assembly
of $N$ simple harmonic oscillators of the same frequency.

7. Find the expression for the specific heat at constant volume for an assembly
of $N$ simple fixed axis rotators with frequencies $\nu_1, \nu_2, \ldots, \nu_n$. (Cf. Problem 1.)



CHAPTER VIII 
FUNDAMENTALS OF QUANTUM STATISTICS 

1. REVIEW OF QUANTUM MECHANICS 

The Darwin-Fowler method of developing statistical mechanics is 
well adapted to handle the modification in classical statistics brought 
about by the introduction of the quantum theory. Before we embark 
on the description of quantum statistics, however, it will be desirable 
to review briefly the fundamental principles of quantum mechanics. 1 

Quantum mechanics like its classical prototype deals with physical 
systems described by means of coordinates and conjugate momenta: 
we shall still be dealing with the $q_j$ and $p_j$ and the number of degrees of
freedom of the system under discussion will still be denoted by $f$. The
various properties of the system such as its total momentum, energy, 
angular momentum, are called observables, and it is the task of 
quantum mechanics to predict the allowed numerical values of these 
observables as well as their average values. In the first part of this 
program it differs decidedly from classical mechanics since in the latter 
all real values of observables are possible. In the second part of the 
program it reminds us of the fundamental problem of statistical 
mechanics. But the concept of the state of a physical system in
quantum mechanics is very different from that in classical mechanics. In
classical mechanics we know the state of a system if we have given
the instantaneous values of the $q$'s and the $p$'s which characterize it,
in other words, its phase. In quantum mechanics, on the other hand,
the state is characterized by a certain function of the coordinates,
known usually as a state function or $\phi$ function, 2 the only restrictions
on which are that it must be single-valued and quadratically
integrable, i.e., $\int \phi^* \phi\, d\tau$ exists, where $\phi^*$ is the complex conjugate of $\phi$ and
$d\tau$ is the element of volume in configuration space, that is, the space
of the $q$'s. This purely abstract characterization must, of course, be
supplemented by the statement that quantum mechanics provides a
way of assigning proper state functions to systems and ways of using
the $\phi$ functions to calculate possible and average values of observables.
The process indicated is carried out by assigning to each observable
an operator, e.g., for the component momentum in the $x$ direction $p_x$,
the differential operator $\dfrac{h}{2\pi i} \dfrac{\partial}{\partial x}$ is chosen, where $h$ is Planck's constant,
and for the energy, the Hamiltonian operator, which is the Hamiltonian
function with each momentum component in it replaced by its
appropriate operator. One of the fundamental assumptions of quantum
mechanics then is that the only possible values of an observable $p$
for a particular system are the characteristic values of the equation

$$P\phi = p\phi, \tag{1}$$

where the left-hand side consists of the result of operating on the $\phi$
function with the operator $P$ characteristic of the observable $p$ and
the right side is simply the numerical value of the observable multiplied
into the $\phi$ function. In general it is found that only for certain values
of $p$ in (1) is it possible to obtain solutions for $\phi$ which satisfy the
fundamental restrictions mentioned above. These are the possible
values of the observable and the corresponding $\phi$ functions, usually
denoted now as $\psi$ functions, are the corresponding state functions for
the observable. Thus if $\psi_k$ is the state function corresponding to the
value $p_k$ of the observable,

$$P\psi_k = p_k \psi_k \tag{2}$$

is an identity with $\psi_k$ satisfying all the fundamental conditions imposed
on state functions. It is customary to refer to the values $p_k$ as the
eigenvalues of the observable and the corresponding functions $\psi_k$ as the
eigenfunctions or eigenstates of the observable. For atomic problems the
most important form of equation (1) is that for which the observable
is the energy. It then becomes the Schrödinger equation

$$H\psi = E\psi, \tag{3}$$

in which $H$ is the Hamiltonian operator and $E$ the numerical value of
the energy. The eigenfunctions $\psi$ are functions of the configurational
and spin coordinates of the system but they are also characterized by
certain parameters, called quantum numbers. These are usually
represented by $n$ (principal quantum number), $l$ (azimuthal quantum
number), $m$ (magnetic quantum number) and $s$ (spin quantum number).
The eigenvalues $E$ are of course characterized by different values
of $n$, $l$, $m$, $s$.

1 A more extensive survey, well adapted to the purposes of the present work,
will be found in Chapter IX of Lindsay and Margenau's "Foundations of Physics."
For the professional treatises on the subject the reader may consult Dirac's
"Principles of Quantum Mechanics," Oxford Univ. Press, second edition, 1935; or,
as more suitable for the general reader, Kemble's "Fundamental Principles of
Quantum Mechanics," McGraw-Hill, 1937.

2 The $\phi$ function is also often called a "wave" function. The reader must be
careful to distinguish between the state function of quantum mechanics and the
state variables of thermodynamics (Chapter III).

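As a simple illustration the harmonic oscillator may be cited. Here
the Hamiltonian operator is

$$H = \frac{1}{2m}\Big(\frac{h}{2\pi i}\frac{d}{dx}\Big)^2 + \frac{k}{2}x^2,$$

and eq. (3) becomes, with $\psi$ in place of $\phi$ to conform to popular usage,

$$\frac{d^2\psi}{dx^2} + \frac{8\pi^2 m}{h^2}\Big(E - \frac{kx^2}{2}\Big)\psi = 0. \tag{4}$$

The eigenvalues of $E$ turn out to be

$$E_n = (n + \tfrac12)h\nu, \tag{5}$$

with $\nu$ = frequency of the oscillator = $\dfrac{1}{2\pi}\sqrt{k/m}$. The eigenfunctions
are

$$\psi_n \propto e^{-kx^2/2h\nu}\, H_n\Big(\sqrt{\frac{k}{h\nu}}\, x\Big), \tag{6}$$

where $H_n\Big(\sqrt{\dfrac{k}{h\nu}}\, x\Big)$ is the so-called Hermite polynomial of order $n$,
the most compact representation of which is

$$H_n(\xi) = (-1)^n e^{\xi^2} \frac{d^n}{d\xi^n} e^{-\xi^2}. \tag{7}$$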




The final fundamental postulate of quantum mechanics says that
when a system is in the state characterized by $\phi$, the expected average
of an observable $p$ from a series of measurements is given by

$$\overline{p} = \frac{\int \phi^* P \phi\, d\tau}{\int \phi^* \phi\, d\tau}. \tag{8}$$

Here as usual $P\phi$ is to be interpreted as the result of operating on $\phi$
with the operator $P$ and it must be recalled that $\phi$ need not be an
eigenstate of the system for the observable $p$. The most direct
significance of the state function is found in the quantum mechanical
theorem that $\phi^*(\lambda_1 \cdots \lambda_f)\, \phi(\lambda_1 \cdots \lambda_f)$ measures the probability that
the system in the state $\phi$ will have its coordinates $q_1 \cdots q_f$ equal to
$\lambda_1 \cdots \lambda_f$ respectively. More strictly $\phi^*(\lambda_1 \cdots \lambda_f)\, \phi(\lambda_1 \cdots \lambda_f)\, dq_1 \cdots
dq_f$ is the probability that the system will have its coordinates in the
range $dq_1 \cdots dq_f$ in the neighborhood of the values $\lambda_1 \cdots \lambda_f$. In order
that the system shall not be found at this place it is essential that
$\phi^* \phi$ vanish there.

In order to understand quantum statistics it is necessary to consider
$N$ identical, indistinguishable physical systems forming an assembly in
the Darwin-Fowler sense. Since the systems are identical the Hamiltonian
has the same form for all, though each will be a function of the
coordinates of the particular system in question. If all the coordinates
of the $j$th system including the spin coordinates $s$ are represented for
simplicity by $q_j$, and if each system is considered isolated from all the
others, we shall have for the description of the energy behavior of the
assembly the $N$ Schrödinger equations

\[ H_1\psi_i(q_1) = E_i\psi_i(q_1), \quad H_2\psi_j(q_2) = E_j\psi_j(q_2), \quad \ldots, \quad H_N\psi_k(q_N) = E_k\psi_k(q_N), \tag{9} \]

where the sequences of the eigenvalues $E_i, E_j, \cdots$ are really the same
set in every case. The same is also true of the eigenfunctions $\psi_i, \psi_j, \cdots$,
except that each is a function of the coordinates of a single system.

If now we think of the assembly as a single system without however
contemplating the mutual force interactions between the individual
constituent systems, i.e., still envisage them as relatively far apart,
the resultant Schrödinger equation will be

\[ (H_1 + H_2 + \cdots + H_N)\,\psi(q_1 \cdots q_N) = E\,\psi(q_1 \cdots q_N), \tag{10} \]

where $\psi(q_1 \cdots q_N)$ is the eigenfunction for the assembly and $E$ the
corresponding eigenvalue. On examination eq. (10) is seen to be satisfied
by

\[ \psi = \psi_i(q_1)\,\psi_j(q_2)\cdots\psi_k(q_N) \tag{11} \]

with

\[ E = E_i + E_j + \cdots + E_k. \tag{12} \]

This may be interpreted as meaning that $\psi$ is the eigenstate of the
assembly in which the first system is in eigenstate $\psi_i$ corresponding to
eigenvalue $E_i$, etc., and $E$ as the total energy of the assembly is the sum
of the eigenvalues of the individual systems. Suppose we interchange
two systems of the assembly so that system 1 now has energy $E_j$ and
system 2, energy $E_i$. The total energy $E$ is unchanged but $\psi$ becomes

\[ \psi_i(q_2)\,\psi_j(q_1)\cdots\psi_k(q_N), \tag{13} \]

which in general is a different function from $\psi$ in (11). Thus we have a
different eigenfunction for the assembly corresponding to the same
eigenvalue. This corresponds to what is called degeneracy in the state
of the assembly. The degree of the degeneracy is the number of
different eigenfunctions with the same eigenvalue. Clearly in the
present case this number is $N!$, since there are $N!$ ways of permuting
the $N$ sets of coordinates of the individual systems. If we denote any
one of the functions obtained by such a permutation by $\psi_P$, it follows
from the linearity of the equation (10) that the linear combination

\[ \Psi = \sum_P a_P\,\psi_P \tag{14} \]

is also an eigenfunction of the assembly corresponding to the energy $E$
(assuming that the coefficients $a_P$ are so chosen that $\Psi$ is normalized,⁴
$\int \Psi^*\Psi\,d\tau = 1$).

3 Cf. Lindsay and Margenau, op. cit., p. 478.

We now introduce the Pauli exclusion principle which cuts the
number of possible $\Psi$ functions in (14) to one by means of the following
postulate:

If the individual systems are elementary charged particles (in particular
electrons or protons) the only combinations of the form (14)
realized in nature are antisymmetrical with respect to an interchange of
the coordinates of two systems, i.e., such an interchange produces a
change in sign without changing the absolute value. Of all possible
combinations of the form (14) there is only one which is antisymmetrical and
this may be written in the form of the determinant

\[ \Psi = c \begin{vmatrix} \psi_i(q_1) & \psi_i(q_2) & \cdots & \psi_i(q_N) \\ \psi_j(q_1) & \psi_j(q_2) & \cdots & \psi_j(q_N) \\ \vdots & \vdots & & \vdots \\ \psi_k(q_1) & \psi_k(q_2) & \cdots & \psi_k(q_N) \end{vmatrix}, \tag{15} \]

where $c$ is a constant. An interchange of two $q$'s is equivalent to an
interchange of two columns of the determinant and this leads merely
to a change of sign. It is an interesting consequence of the Pauli
principle that if two of the charged particles are in the same state, i.e.,
if $\psi_i = \psi_j$ for example, two rows of the determinant become equal and
the determinant vanishes with the concomitant vanishing of $\Psi$. This
may be interpreted to mean that it is impossible for two such charged
particles in an assembly to be in identical states. This is often presented
as the statement of Pauli's principle. It is to be noted that the
principle allows two elementary charged particles to be characterized
by the same $n$, $l$, and $m$ values provided the spin quantum numbers are
different. There are only two possible values of the latter and their
difference is interpreted physically as an opposition or antiparallelism
of the direction of spin. It often happens that the numerical value of
the energy depends very slightly on the spin and in that case the Pauli
principle allows us to think of two elementary charged particles in
practically the same energy state, their spins being opposed.

4 Cf. Lindsay and Margenau, op. cit., p. 413.

Suppose in (15) we interchange two pairs of $q$'s, i.e., $q_1$ with $q_2$ and $q_3$
with $q_4$. Since each interchange involves a change of sign with no
change in magnitude, the two interchanges will leave the sign unaltered.
The state function $\Psi$ is symmetrical with respect to interchange of two
pairs of elementary charged particles. If then each individual system
of the assembly consists of a pair of such charged particles or indeed
any even number of them, the wave function of the assembly must be
such that an interchange of two systems leaves it completely unaltered.
To represent such an assembly of composite systems we need a symmetrical
state function in place of the antisymmetrical one in (15). If
the $\psi$'s in (15) are still interpreted as representing the eigenfunctions
of these composite systems we can easily get such a symmetrical function
from (15) by changing all the minus signs in the determinant
expansion to plus. It can be shown indeed that the symmetrical
eigenfunction thus obtained is the only possible symmetrical one. An
illustration of a composite system of this type is provided by the
deuteron, the nucleus of the hydrogen isotope of mass 2. Many
neutral atoms are also of similar character.
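The behavior of the antisymmetrical and symmetrical combinations under an interchange of coordinates can be illustrated with a small Python sketch. It is only a toy: the three single-system functions below are hypothetical stand-ins for $\psi_i, \psi_j, \psi_k$, the coordinates are plain numbers, and the constant $c$ is taken as unity. The sum over permutations with alternating signs is the determinant expansion of (15), and the same sum with all signs positive is the symmetrical function obtained by changing the minus signs to plus.

```python
from itertools import permutations


def parity(perm):
    """Return +1 for an even permutation of 0..N-1 and -1 for an odd one."""
    perm = list(perm)
    sign = 1
    for i in range(len(perm)):
        while perm[i] != i:          # send each index home by transpositions
            j = perm[i]
            perm[i], perm[j] = perm[j], perm[i]
            sign = -sign
    return sign


def combine(states, coords, antisymmetric=True):
    """Sum over permutations P of (+/-) psi_i(q_P1) psi_j(q_P2) ... (c = 1)."""
    total = 0.0
    for perm in permutations(range(len(coords))):
        term = 1.0
        for state, k in zip(states, perm):
            term *= state(coords[k])
        total += (parity(perm) if antisymmetric else 1) * term
    return total


# Hypothetical single-system functions standing in for psi_i, psi_j, psi_k.
states = [lambda q: 1.0, lambda q: q, lambda q: q * q]
coords = [0.3, 1.1, 2.0]
swapped = [1.1, 0.3, 2.0]            # q_1 and q_2 interchanged

print(combine(states, coords), combine(states, swapped))
print(combine([states[1], states[1], states[2]], coords))   # two equal states
print(combine(states, coords, antisymmetric=False),
      combine(states, swapped, antisymmetric=False))
```

The first line of output shows the change of sign on interchange, the second the vanishing of the antisymmetrical function when two systems are put in the same state, and the third the invariance of the symmetrical function.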

2. DISTRIBUTIONS IN QUANTUM STATISTICS 

We are now ready to apply the quantum mechanical ideas of the
preceding section to statistics and statistical distributions. We have
just seen that the state of an assembly of elementary charged particles
is given by the one antisymmetrical linear combination of eigenfunctions
for the individual systems corresponding to the particles. How
does this affect the statistical weight attached to the assembly, i.e., (1)
of Chapter VII? For convenience we rewrite this here, viz.,

\[ W = N!\,\frac{g_0^{N_0}\,g_1^{N_1}\cdots g_r^{N_r}\cdots}{N_0!\,N_1!\cdots N_r!\cdots}. \tag{16} \]

But this is just the total number of independent state functions which
can be formed by taking the product of all the functions for the
individual systems with all possible permutations of coordinates. To
see this, note that a product of the kind in question is like (11), i.e.,

\[ \psi_i(q_1)\,\psi_j(q_2)\cdots\psi_k(q_N). \]

There are $N!$ possible products of this kind obtained by permuting
the $q_1 \cdots q_N$ in the individual $\psi$ functions. If the individual energy
states, however, are themselves degenerate and the degree of degeneracy
of the $i$th state is $g_i$, etc., so that there are $g_i$ independent state
functions corresponding to the $i$th energy value, the number $N!$ must
be multiplied by the $g$'s where, moreover, each $g$ is raised to the power
of the number of systems in the corresponding state. This alone would
give us for the number of state functions required

\[ N!\,g_0^{N_0}\,g_1^{N_1}\cdots g_r^{N_r}\cdots. \tag{17} \]

With $N_0$ systems in the zeroth state, $N_1$ in the first state, etc., a
product like (11) becomes, for example,

\[ \psi_0(q_1)\,\psi_0(q_2)\cdots\psi_0(q_{N_0})\,\psi_1(q_{N_0+1})\,\psi_1(q_{N_0+2})\cdots \]

Now the $N_0!$ permutations of the $q_1, q_2, \cdots, q_{N_0}$ among the first $N_0$
factors do not produce new and different states. As a result (17) must
be divided by

\[ N_0!\,N_1!\cdots N_r!\cdots \]

in order to get the actual number of independent state functions representing
the state of the assembly for a given value, $E$, of the energy. If
it were not for the Pauli exclusion principle we should expect to use
the statistical weight (16) in our statistical calculations for quantized
systems. However, the principle insists that actually the numbers
$N_0, N_1, \cdots, N_r, \cdots$ cannot exceed unity for any assembly realized in
nature, and that for any set of values satisfying this criterion $W = 1$;
otherwise $W$ must vanish. It is clear then, that the exclusion principle
forces us to abandon (16) as an expression for the statistical weight.
The new situation may be expressed in the following way.
For an assembly represented by an antisymmetrical wave function
the statistical weight $W$ has the following values:

\[ W = 1 \ \text{for } N_r = 0, 1 \ (\text{any } r), \qquad W = 0 \ \text{for all other values of } N_r. \tag{18} \]

With this we can now proceed to derive the corresponding distribution
law. The average number of systems in the $r$th state is (cf. eq. (4) of
Chapter VII)

\[ \bar{N}_r = \frac{\sum N_r W}{\sum W}. \tag{19} \]

We shall first calculate the denominator, which we recall is the sum of
all weights of the assembly subject only to the conditions (2) and (3)
of Chapter VII. We can avoid the analytical complexity introduced
by these conditions by noting that $\sum W$ may be expressed as the coefficient
of $x^N z^E$ in the expansion

\[ M = \sum_{N_0, N_1, \cdots, N_r, \cdots} W\, x^{\sum_j N_j}\, z^{\sum_j N_j E_j}, \tag{20} \]

where in the summation the $N$'s may be any positive integers or zero and are
no longer restricted by the conditions mentioned. Now $M$ may be
rewritten

\[ M = \sum_{N_0, N_1, \cdots, N_r, \cdots} W \prod_j x^{N_j} z^{N_j E_j}. \tag{21} \]

In the evaluation of the product, if in any factor $N_j$ is different from
0 or 1 the whole expression vanishes since $W = 0$ unless $N_j$ is equal to
0 or 1. Consequently (21) is the sum of all products of the form

\[ \prod_j x^{N_j} z^{N_j E_j}, \]

where $N_j = 0$ or 1. This sum itself, however, is most simply written as
a product, namely

\[ M = (1 + xz^{E_0})(1 + xz^{E_1})(1 + xz^{E_2})\cdots = \prod_j (1 + xz^{E_j}). \tag{22} \]

Thus when the product is multiplied out the extreme terms in the resulting
sum represent respectively that in which every $N_j = 0$ and that in which
every $N_j = 1$. All other possible combinations are represented in the
intermediate terms.

In Sec. 2, Chapter VII we showed that the coefficient of $z^E$ in the
expansion of a function $F(z)$ in powers of $z$ is

\[ \frac{1}{2\pi i}\oint \frac{F(z)\,dz}{z^{E+1}}. \]

Hence by the use of the same reasoning the coefficient of $x^N z^E$ in the
expansion (20) for $M$ will be given by

\[ \frac{1}{(2\pi i)^2}\oint_C\oint_{C'} \frac{M\,dx\,dz}{x^{N+1} z^{E+1}}, \]

where $C$ and $C'$ are closed contours about the origin in the complex
plane and both $x$ and $z$ are considered to be complex variables.
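The statement that $\sum W$ is the coefficient of $x^N z^E$ in $M$ can be checked by brute force on a toy assembly. The Python sketch below is illustrative only: it assumes a handful of non-degenerate levels with small integer energies (a choice made here purely so that the powers of $z$ can be used as dictionary keys), multiplies out the product $\prod_j (1 + xz^{E_j})$ of eq. (22) term by term, and compares each coefficient with a direct count of the occupation sets $N_j = 0$ or 1 having the required totals.

```python
from itertools import product


def expand_fermi(levels):
    """Multiply out prod_j (1 + x z^{E_j}); return {(N, E): coefficient}."""
    poly = {(0, 0): 1}
    for energy in levels:
        new = dict(poly)                      # factor "1": level left empty
        for (n, e), c in poly.items():        # factor "x z^E": level occupied
            key = (n + 1, e + energy)
            new[key] = new.get(key, 0) + c
        poly = new
    return poly


def count_directly(levels, n_total, e_total):
    """Count sets N_j in {0, 1} with sum N_j = N and sum N_j E_j = E."""
    hits = 0
    for occ in product((0, 1), repeat=len(levels)):
        energy = sum(o * e for o, e in zip(occ, levels))
        if sum(occ) == n_total and energy == e_total:
            hits += 1
    return hits


levels = [0, 1, 2, 3, 4]          # assumed integer energies, arbitrary units
poly = expand_fermi(levels)
print(poly[(2, 4)], count_directly(levels, 2, 4))
```

For two systems with total energy 4 the only admissible occupations are of the levels (0, 4) and (1, 3), so both numbers printed are 2. In a realistic assembly the expansion cannot be carried out explicitly, which is why the contour integral is introduced.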

We next get the expression for the numerator in $\bar{N}_r$. Let us calculate
$\frac{1}{\log z}\frac{\partial M}{\partial E_r}$. From (21) differentiation yields

\[ \frac{1}{\log z}\frac{\partial M}{\partial E_r} = \sum N_r W \prod_j x^{N_j} z^{N_j E_j}. \tag{24} \]

But $\sum N_r W$ is just the coefficient of $x^N z^E$ in the expansion on the right
of (24) and therefore the coefficient of the same power in
$\frac{1}{\log z}\frac{\partial M}{\partial E_r}$. Consequently

\[ \bar{N}_r = \frac{\displaystyle\oint\!\!\oint \frac{1}{\log z}\frac{\partial M}{\partial E_r}\,\frac{dx\,dz}{x^{N+1} z^{E+1}}}{\displaystyle\oint\!\!\oint M\,\frac{dx\,dz}{x^{N+1} z^{E+1}}}. \tag{25} \]

We can put the term $\frac{1}{\log z}\frac{\partial M}{\partial E_r}$ into somewhat more suitable form
by writing

\[ M = \prod_j \left[1 + x\,(e^{\log z})^{E_j}\right], \tag{26} \]

whence

\[ \frac{1}{\log z}\frac{\partial M}{\partial E_r} = \frac{M x z^{E_r}}{1 + xz^{E_r}} = Mx\,\frac{\partial}{\partial x}\log(1 + xz^{E_r}). \tag{27} \]

The method of steepest descents may now be applied to the integral in
(25) by treating the integrations with respect to $z$ and $x$ separately.
Let the saddle point along the real axis for $x$ correspond to⁵ $x = \mu$ and
the saddle point along the real axis for $z$ correspond to $z = \zeta$. Then
since the function in the integrand can be shown to satisfy the requirements
of the method we can write at once

\[ \bar{N}_r = \mu\,\frac{\partial}{\partial\mu}\log(1 + \mu\zeta^{E_r}) = \frac{\zeta^{E_r}}{1/\mu + \zeta^{E_r}}. \tag{28} \]

This is the distribution formula for an assembly of elementary charged
particles. If we identify $\zeta$ with $e^{-1/kT}$, as in the classical case of
Chapter VII, (28) becomes

\[ \bar{N}_r = \frac{1}{\frac{1}{\mu}\,e^{E_r/kT} + 1}. \tag{29} \]

Assemblies with this distribution law are said to obey the Fermi-Dirac
statistics. This should then apply to an assembly of electrons.
We note at once the difference from the classical statistical distribution
law (31) of Chapter VII. The quantity $\mu$ appears as a new statistical
parameter in addition to $\zeta$. Its significance will be discussed shortly.

5 The reader should be careful not to confuse the $\mu$ of this chapter with that of
Chapter IV.

Before investigating the application of (29) we ought to notice that
there is another possible quantum statistical distribution law. This
will hold for an assembly whose eigenfunction is symmetrical. Here
the weight to be attached to the assembly is also equal to unity for
there is still only one symmetrical state function associated with the
assembly, but now all values of $N_r$ are possible since the symmetrical
function does not vanish no matter how many individual systems are in
the same state. Thus we write in place of (18)

\[ W = 1, \ \text{all } N_r. \tag{30} \]

In eq. (21) we must therefore now remove the restriction that $N_j$ can
have only the values 0 or 1 in order to avoid a vanishing product. Thus
$M$ can now be written as the sum of all products of the form

\[ \prod_j x^{N_j} z^{N_j E_j} \]

in which $N_j$ may take on any values; but this itself may be expressed in
the form of a product of sums, viz.,

\[ M = (1 + xz^{E_0} + x^2 z^{2E_0} + \cdots)(1 + xz^{E_1} + x^2 z^{2E_1} + \cdots)\cdots. \tag{31} \]

Consequently, since $x$ and $z$ are both less than unity in absolute value,

\[ M = \prod_j \frac{1}{1 - xz^{E_j}}. \tag{32} \]

The evaluation of $\bar{N}_r$ proceeds precisely as in the preceding case with

\[ \frac{1}{\log z}\frac{\partial M}{\partial E_r} = \frac{M x z^{E_r}}{1 - xz^{E_r}} = -Mx\,\frac{\partial}{\partial x}\log(1 - xz^{E_r}). \tag{33} \]

The result is

\[ \bar{N}_r = -\mu\,\frac{\partial}{\partial\mu}\log(1 - \mu\zeta^{E_r}) = \frac{\zeta^{E_r}}{1/\mu - \zeta^{E_r}}. \tag{34} \]

This is the distribution law for what is usually termed the Bose-Einstein
statistics. With the usual substitution for $\zeta$ it becomes

\[ \bar{N}_r = \frac{1}{\frac{1}{\mu}\,e^{E_r/kT} - 1}. \tag{35} \]

The only mathematical difference between (35) and (29) is the sign
before the 1 in the denominator. This leads to a very fundamental
difference in the physical meaning, however, as will appear shortly.
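The effect of the sign before the 1 can be seen by tabulating the two laws side by side. The Python sketch below is only an illustration: the function names, the value $\mu = 0.5$ and the use of units in which $kT = 1$ are assumptions made here for convenience and do not come from the text.

```python
import math


def fermi_dirac(energy, mu, kT):
    """Eq. (29): 1 / ((1/mu) e^{E/kT} + 1)."""
    return 1.0 / (math.exp(energy / kT) / mu + 1.0)


def bose_einstein(energy, mu, kT):
    """Eq. (35): 1 / ((1/mu) e^{E/kT} - 1); needs (1/mu) e^{E/kT} > 1."""
    denom = math.exp(energy / kT) / mu - 1.0
    if denom <= 0.0:
        raise ValueError("mu * exp(-E/kT) must be less than 1")
    return 1.0 / denom


for energy in (0.5, 1.0, 5.0, 10.0):
    print(energy, fermi_dirac(energy, 0.5, 1.0), bose_einstein(energy, 0.5, 1.0))
```

The Fermi-Dirac value never exceeds unity, while the Bose-Einstein value is always the larger of the two. When $(1/\mu)e^{E_r/kT}$ is large compared with 1 the two columns become practically indistinguishable.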

It is instructive to observe that the expression for $\bar{N}_r$ in the classical
case of Sec. 2, Chapter VII (eq. 24 or 31) can also be obtained directly
by the method of this section. For we can write

\[ M = \sum W\, x^{\sum_j N_j}\, z^{\sum_j N_j E_j} \]

in the form

\[ M = N! \sum \prod_j \frac{g_j^{N_j}}{N_j!}\, x^{N_j} z^{N_j E_j}. \tag{36} \]

Furthermore the sum of all the products indicated in (36) may be
expressed as the product of sums, viz.

\[ M = N! \prod_j \sum_{n=0}^{\infty} \frac{(g_j x z^{E_j})^n}{n!} = N! \prod_j e^{g_j x z^{E_j}}. \tag{37} \]

As before we form $\frac{1}{\log z}\frac{\partial M}{\partial E_r}$ and finally get

\[ \bar{N}_r = \mu g_r \zeta^{E_r} = \mu g_r e^{-E_r/kT}. \tag{38} \]

This is in the form of the distribution formula (24) in Chapter VII if we
set

\[ \mu = \frac{N}{\sum_j g_j \zeta^{E_j}}. \tag{39} \]

This helps to make clear the physical significance of the distribution
parameter $\mu$ in the general quantum statistical case. It must be so
chosen that $\sum \bar{N}_r = N$, the total number of systems in the assembly.
This will serve to fix it in terms of $N$ and $\zeta$. The evaluation of $\mu$ will
be carried out in the following sections. Independent justification of
the association of $\zeta$ with temperature in the quantum statistical case
will also be given.
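The condition $\sum \bar{N}_r = N$ can be turned into a small numerical procedure. The Python sketch below treats the Fermi-Dirac law (29) for a hypothetical set of ten equally spaced levels. The level list, the units with $kT = 1$, the function names and the bracket chosen for $\log\mu$ are all assumptions made for illustration. Since each term of the sum increases steadily with $\mu$, simple bisection is sufficient.

```python
import math


def total_fermi(mu, levels, kT):
    """Sum of the Fermi-Dirac mean occupations 1 / ((1/mu) e^{E/kT} + 1)."""
    return sum(1.0 / (math.exp(e / kT) / mu + 1.0) for e in levels)


def solve_mu(n_total, levels, kT, iterations=200):
    """Find mu > 0 with total_fermi(mu) = n_total by bisection on log(mu)."""
    if not 0 < n_total < len(levels):
        raise ValueError("need 0 < N < number of levels")
    lo, hi = -50.0, 50.0      # bracket for log(mu); assumed wide enough here
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if total_fermi(math.exp(mid), levels, kT) < n_total:
            lo = mid
        else:
            hi = mid
    return math.exp(0.5 * (lo + hi))


levels = list(range(10))      # hypothetical energies 0, 1, ..., 9 in units of kT
mu = solve_mu(4, levels, 1.0)
print(mu, total_fermi(mu, levels, 1.0))
```

For this particular level scheme a half-filled assembly ($N = 5$) gives $\log\mu = 4.5$ by symmetry, which provides a convenient check on the routine.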

3. ALTERNATIVE TREATMENT OF QUANTUM STATISTICAL 
DISTRIBUTIONS 

Before proceeding to apply the distribution formulas (29) and (35) 
for the Fermi-Dirac and Bose-Einstein quantum statistics to the 
properties of gases and solids it will be worth while to present an alter- 
native method of derivation. We shall here revert to the method of 
Chapter IV and shall not hesitate to use Stirling's formula. Moreover 
we shall derive the distribution laws for all three types of statistics. 

We wish to distribute an assembly of $N$ objects, e.g., material
particles, into energy states in such a way that in the states between
$E_j$ and $E_j + dE_j$ there are $N_j$ particles, etc. To make the discussion
more pictorial we visualize the energy interval as a region in phase
space, a kind of energy shell containing all values from $E_j$ to $E_j + dE_j$.
In classical statistics this shell is conceived to contain a continuous
range of energy values but in quantum statistics it is necessary to give
it a structure and to suppose that it consists of cells associated with
each one of which there is a definite possible state of a particle of the
assembly. Let us suppose there are $n_j$ of these cells in the $j$th energy
shell. We shall shortly derive an expression for the dependence of $n_j$
on $E_j$ and $dE_j$.

We proceed first to distribute the particles in accordance with a
suggestion of Brillouin.⁶ Each cell is assigned a capacity dependent on
the number of particles in it. In particular for a cell with $p$ occupants
the capacity is assumed to be $1 - pa$, where $a$ is a real parameter which
may assume different values, to be discussed later. The weight
attributed to the $j$th shell is the number of ways of assigning $N_j$ particles
to the $n_j$ cells in the shell subject to the above capacity limitation.
The first particle may be placed in $n_j$ ways, but the second in only
$n_j - a$ ways, since the cell already occupied has now a capacity of only
$1 - a$. The number of ways of assigning all $N_j$ particles will be denoted
by $W_j$, where

\[ W_j = n_j(n_j - a)(n_j - 2a)\cdots[n_j - (N_j - 1)a] = \frac{a^{N_j}\,(n_j/a)!}{(n_j/a - N_j)!}. \tag{40} \]

It is, of course, assumed that $n_j - (N_j - 1)a > 0$ and that algebraic
meaning can be associated with $(n_j/a)!$ etc. The weight corresponding
to a distribution of the whole assembly in which there are $N_0$ particles
in the energy shell about $E_0$, $N_1$ in the energy shell about $E_1$, etc.
(with $\sum N_j = N$, and $\sum N_j E_j = E$ = constant) is then the number of
ways of assignment of the $N$ particles to the shells multiplied by the
number of arrangements among the cells in every shell. The total
weight then becomes

\[ w = \frac{N!}{\prod_j N_j!}\,\prod_j W_j. \tag{41} \]

6 L. Brillouin, "Les Statistiques Quantiques," Vol. 1, pp. 167 ff., Paris, 1930.
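The counting just described is easy to try out numerically. The Python sketch below is a hedged illustration: it assumes that the shell weight is the product of the successive numbers of ways $n_j,\ n_j - a,\ n_j - 2a, \cdots$ and that the total weight is this product over all shells multiplied by $N!/\prod_j N_j!$, the number of ways of assigning the $N$ particles to the shells. The function names and the sample numbers are hypothetical, and exact fractions are used so that fractional values of $a$ cause no rounding.

```python
from fractions import Fraction
from math import factorial


def shell_weight(n_cells, n_particles, a):
    """W_j = n_j (n_j - a)(n_j - 2a) ... (n_j - (N_j - 1) a)."""
    a = Fraction(a)
    weight = Fraction(1)
    for p in range(n_particles):      # p particles already placed
        weight *= n_cells - p * a
    return weight


def total_weight(cells, occupations, a):
    """w = N! / prod_j N_j!  times  prod_j W_j (assumed form of the total weight)."""
    w = Fraction(factorial(sum(occupations)))
    for n_j, big_n_j in zip(cells, occupations):
        w *= shell_weight(n_j, big_n_j, a) / factorial(big_n_j)
    return w


print(shell_weight(5, 3, 1))    # 5 * 4 * 3
print(shell_weight(5, 3, -1))   # 5 * 6 * 7
print(shell_weight(5, 3, 0))    # 5 ** 3
print(total_weight([5, 4], [3, 2], 1))
```

With $a = 1$ the shell weight vanishes as soon as $N_j$ exceeds $n_j$, with $a = 0$ every particle has all $n_j$ cells open to it, and with $a = -1$ each particle already placed makes room for one more.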



We now follow the procedure of Sec. 2 of Chapter IV and inquire
for the distribution corresponding to maximum $w$ subject to the usual
conditions on the total number of particles and the total energy. Using
Stirling's formula and making some reductions, we have

\[ \log w = N\log N - N - \sum_j N_j \log N_j + \frac{1}{a}\sum_j n_j \log\frac{n_j}{n_j - aN_j} + \sum_j N_j \log(n_j - aN_j). \tag{42} \]

We now set $\delta \log w = 0$ subject to the conditions mentioned and get

\[ \sum_j \delta N_j \log\left(\frac{n_j - aN_j}{N_j}\right) = 0 \tag{43} \]

subject to

\[ \sum_j \delta N_j = 0, \qquad \sum_j E_j\,\delta N_j = 0. \tag{44} \]



Introducing the undetermined multipliers $\gamma_1$ and $\gamma_2$ and proceeding
precisely as in Sec. 2 of Chapter IV we are finally led to

\[ \frac{N_j}{n_j} = \frac{1}{a + e^{-\gamma_1}\,e^{-\gamma_2 E_j}}. \tag{45} \]
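For the reader who does not have Chapter IV at hand, the intermediate steps may be sketched as follows. This is a reconstruction under the assumption that the multipliers $\gamma_1$ and $\gamma_2$ are attached to the two conditions (44) with the signs shown, which is the choice that reproduces (45). Once the multipliers are added the variations $\delta N_j$ may be treated as independent, so that each bracket must vanish separately.

```latex
% Sketch of the multiplier argument; the sign convention for
% \gamma_1, \gamma_2 is an assumption chosen to reproduce (45).
\begin{align*}
\sum_j \Bigl[\log\frac{n_j - aN_j}{N_j} + \gamma_1 + \gamma_2 E_j\Bigr]\,\delta N_j &= 0, \\
\log\frac{n_j - aN_j}{N_j} &= -\gamma_1 - \gamma_2 E_j, \\
\frac{n_j}{N_j} - a &= e^{-\gamma_1}\,e^{-\gamma_2 E_j}, \\
\frac{N_j}{n_j} &= \frac{1}{a + e^{-\gamma_1}\,e^{-\gamma_2 E_j}}.
\end{align*}
```

The first line is (43) with $\gamma_1$ times the first and $\gamma_2$ times the second of the conditions (44) added to it.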

$N_j/n_j$ is the average number of particles per cell in the $j$th shell for the
distribution corresponding to maximum $w$, i.e., that which we termed
the canonical distribution in classical Maxwell-Boltzmann statistics.
Coupled with a knowledge of $n_j$, eq. (45) constitutes the quantum
statistical distribution formula analogous to the classical formula (25)
of Chapter IV. If we set

\[ \gamma_2 = -\frac{1}{kT}, \tag{46} \]

the distribution formula becomes

\[ \frac{N_j}{n_j} = \frac{1}{a + e^{-\gamma_1}\,e^{E_j/kT}}. \tag{47} \]

We naturally seek a connection between (47) and the distribution laws
(29), (35) and (38) already derived in the previous section by the
Darwin-Fowler method. Let us first look at the matter from the purely
formal mathematical standpoint. If $a \to 0$, (47) becomes

\[ N_j = n_j\,e^{\gamma_1}\,e^{-E_j/kT}. \tag{48} \]

Here $N_j$ has precisely the same dependence on $E_j$ as $\bar{N}_r$ on $E_r$ in eq.
(38). It evidently is the classical distribution law. The two formulas
become identical if we identify $\mu g_j$ in (38) with $e^{\gamma_1} n_j$ in (48). What
physical significance can we attach to the choice $a = 0$? It clearly
corresponds to a constant capacity of unity for every cell independent
of the number of particles in it. But this is just the classical assumption
that every cell has the same a priori probability. The weight $g_j$ is
then simply the number of cells available with energy $E_j$ and corresponds
to $n_j$. The parameter $\mu$ will finally be associated with $e^{\gamma_1}$.

In looking for a connection between (47) and the quantum statistical
formulas (29) and (35) we are naturally led to try the assumptions
$a = +1$ and $a = -1$. The form of $N_j/n_j$ in (47) with $a = 1$ then
looks much like that of $\bar{N}_r$ in (29) and there is a similar resemblance
between (47) for $a = -1$ and (35), particularly if we agree to let $e^{\gamma_1}$
stand for $\mu$. Unfortunately we should naturally wish to associate $N_j$
in the one case with $\bar{N}_r$ in the other, whereas the appearance of $n_j$ in
(47) appears as a sort of stumbling block. This difficulty is cleared up
when we reflect that the $\bar{N}_r$ and the $N_j$ do not after all refer to the
same thing. The $r$th energy state in the Darwin-Fowler method of
Sec. 2 is a genuine microscopic energy level to which the specific
energy value $E_r$ is assigned. In the Fermi-Dirac statistics only zero or
one particle may exist in this state. On the other hand in the alternative
treatment of the present section $N_j$ is the number of particles in a
whole range or shell of energy values ranging from $E_j$ to $E_j + dE_j$ and
$n_j$ represents the number of possible or allowed energy states in this
range. Consequently the $E_r$ values in (29) refer strictly to the energies
associated with the various cells in a particular shell in the Brillouin
method, while the $E_j$ value in (47) refers to the average energy over a
whole shell of states. If we like we may think of (29) as defining a
microscopic distribution and (47) as defining a macroscopic distribution.
That the distinction does not appear to be necessary in the case
of classical statistics where, as we have just seen in the preceding paragraph,
we are able at once to identify formula (48) with (38), is not so
surprising when we consider that in classical statistics there is no real
necessity for subdividing energy shells into discrete cells; there is
indeed no prescribed lower limit to the size of the energy interval and
no reason why we should not identify $E_r$ in the one case with $E_j$ in the
other. A similar statement holds for the identification of $g_j$ with $n_j$.
However we are naturally more interested in the quantum statistical
case. With the distinction between the two alternative points of view
held clearly in mind while we realize that there is no essential inconsistency
involved, we can proceed to use whichever form of distribution
formula seems more convenient. For the present we shall continue
our discussion on the basis of eq. (47).

It is desirable, however, to pay a little attention to the physical
significance which can be attached to the choices $a = +1$ and $-1$ in
the Brillouin point of view. The assignment $a = 1$ implies that the
capacity of a cell is unity when no particle is in it but drops to zero as
soon as one particle enters. Consequently on this choice a cell may
hold at most one particle; there are only two possibilities, one or zero.
The connection between this and the Pauli exclusion principle, which
led to the Fermi-Dirac statistics from quantum mechanics, is clear. If
on the other hand $a = -1$, the capacity of a cell with $p$ occupants is
$1 + p$, i.e., the capacity of the cell increases with the number of occupants.
The connection with the Bose-Einstein statistical assumption
expressed in (30) is not indeed so clear as in the corresponding case of
$a = 1$ and the Fermi-Dirac statistics. Nevertheless there appears to
be no essential inconsistency between the two points of view. Moreover
we are not restricted to the values of $a = +1$ and $-1$. For
example, $0 < a < 1$ would imply a loosening of the Fermi restriction
though at the same time allowing less freedom of occupancy of cells
than the classical theory. Certain analytical difficulties with fractional
$a$ would indeed appear to arise from the fact that $W_j$ in (40) must
be integral. Closer inspection shows that these can be overcome,
though we shall not pursue this possibility here.⁷

7 Cf. R. B. Lindsay, Phil. Mag. [7] 17, 264 (1934); also D. S. Kothari, Phil.
Mag. 18, 192 (1934).

The next step in the exploitation of the distribution formula (47)
is to establish the connection between the parameter $\gamma_2$ and the temperature
in a straightforward fashion. We shall use the elementary
statistical-thermodynamical analogy (cf. eq. 68, Chapter IV)

\[ k\,d\log w = \frac{dE}{T}, \tag{49} \]

where $dE$ is the change in internal energy due to flow of energy, e.g.,
heat (not change in internal parameters). Since (49) holds for equilibrium
states only we must use the maximum $\log w$, i.e., that corresponding
to the distribution law (45). The change in $\log w$ associated
with changes $dN_j$ in the number of particles in the $j$th shell due to the
inflow of heat is

\[ d\log w = \sum_j dN_j \log\frac{n_j - aN_j}{N_j}. \tag{50} \]

But $\log\dfrac{n_j - aN_j}{N_j} = -\gamma_1 - \gamma_2 E_j$. Hence

\[ d\log w = -\sum_j (\gamma_1 + \gamma_2 E_j)\,dN_j = -\gamma_2 \sum_j E_j\,dN_j = -\gamma_2\,dE. \tag{51} \]

We note that though the number of particles does not change when
heat is added, so that $\sum_j dN_j = 0$, the total energy changes by $\sum_j E_j\,dN_j$ since the effect of
the energy flow is to alter the number of particles in the higher energy
shells. Hence the analogy (49) leads to

\[ \gamma_2 = -\frac{1}{kT}, \]

in agreement with our assumption (46).

We shall conclude this section by a brief reference to the more con- 
ventional method of discussing the Fermi-Dirac and Bose-Einstein dis- 
tribution formulas in order to point out certain differences with the 
Brillouin method just described. 

Let us first consider the Fermi-Dirac case. The problem to be
solved is still the distribution of $N$ identical objects in energy shells
with $N_j$ in the $j$th shell and the number of available cells in the $j$th
shell being equal to $n_j$. It is assumed that no more than one object
can be put into any cell. The number of ways in which $N_j$ objects
can be placed in $n_j$ cells so that no cell contains more than one object
is equal to the number of ways in which the $n_j$ cells can be divided into
two groups with $N_j$ in one group (those which contain one object) and
$n_j - N_j$ in the other group (those which contain no object). From
Chapter II this number is

\[ \frac{n_j!}{N_j!\,(n_j - N_j)!}. \]

Consequently the total number of ways of distributing all $N$ objects is

\[ \prod_j \frac{n_j!}{N_j!\,(n_j - N_j)!}. \tag{41$'$} \]

On this view this should be the statistical probability or weight associated
with the distribution. It will be noted at once that it is not
exactly the value (41) obtained above for $a = 1$. Rather, it is equal to
$w/N!$.

We proceed next to the Bose-Einstein distribution by a similar
direct method. We now wish to distribute $N_j$ objects among $n_j$ energy
cells in such a way that no restriction is placed on the number of
objects in any cell. This is clearly the number of combinations of $n_j$
things $N_j$ at a time with all possible repetitions allowed. From the
algebraic theory of combinations, this number is equal to the number
of combinations of $n_j + N_j - 1$ objects $N_j$ at a time without repetitions.
Consequently the required total statistical probability or weight
becomes

\[ \prod_j \frac{(n_j + N_j - 1)!}{N_j!\,(n_j - 1)!}. \tag{41$''$} \]

Now if in eq. (40) we let $a = -1$, which should correspond to the Bose-Einstein
distribution in accordance with our previous assumptions,
the value of $w$ becomes

\[ w = N! \prod_j \frac{(N_j + n_j - 1)!}{N_j!\,(n_j - 1)!}, \]

which again is just $N!$ times the value (41″) obtained by direct counting.
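The two direct counts can be confirmed by listing the arrangements explicitly for a single small shell. The Python sketch below is illustrative; the function names are introduced here, and the enumeration relies on the standard library routines `itertools.combinations` (each cell used at most once) and `itertools.combinations_with_replacement` (repetitions allowed), with `math.comb` supplying the closed forms for comparison.

```python
from itertools import combinations, combinations_with_replacement
from math import comb


def count_fermi(n_cells, n_objects):
    """Ways to put identical objects into cells, at most one per cell."""
    return sum(1 for _ in combinations(range(n_cells), n_objects))


def count_bose(n_cells, n_objects):
    """Ways to put identical objects into cells with no occupancy limit."""
    return sum(1 for _ in combinations_with_replacement(range(n_cells), n_objects))


n_cells, n_objects = 5, 3
print(count_fermi(n_cells, n_objects), comb(n_cells, n_objects))
print(count_bose(n_cells, n_objects), comb(n_cells + n_objects - 1, n_objects))
```

For five cells and three objects the counts are 10 and 35, that is, $5\cdot4\cdot3/3!$ and $5\cdot6\cdot7/3!$, in agreement with the factorial expressions for a single shell.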

The question arises: What is to be done about the factor $N!$ which
occurs in the Brillouin method of computing the statistical weights but
is absent in the direct counting scheme? It will be recalled that this
same factor occurs in the expression for $w$ in the classical Maxwell-Boltzmann
statistics in Chapter IV. There we found its presence of
no particular moment as far as the distribution function is concerned.
It proved indeed embarrassing in connection with the statistical
definition and evaluation of the entropy and free energy; we found
it necessary to use $w/N!$ as an "effective statistical probability" in the
definition of entropy. If we use the Brillouin expression (41) for $w$ the
same situation will prevail in quantum statistics. We could, of course,
redefine the statistical probability to bring it into agreement with
(41′) and (41″) and this will be the more logical course if we consider
the quantum statistical distributions to be the fundamental ones.⁸
However we shall find it simpler to continue to use the Brillouin $w$ and
define entropy in terms of $w/N!$. All applications will then be perfectly
consistent.

4. EVALUATION OF $n_j$ FOR AN ASSEMBLY OF FREE PARTICLES

Before we can apply formula (47) to concrete cases we must have
$n_j$, or the number of cells in the $j$th energy shell. This is the same as the
number of possible energy values lying in the interval $E_j$ to $E_j + dE_j$.
Its value clearly depends on the nature of the assembly. The method
of evaluation will be quantum mechanical, which is reasonable since
we are talking about quantum statistics. We shall confine our attention
to an assembly of free particles all having the same mass, the
statistical analogue of an ideal gas. The problem is to determine the
allowed energy values of the assembly. Since the particles are assumed
to exert no forces on each other, these values can be readily calculated
by direct summation from the energy eigenvalues for a single particle
confined to a closed vessel. Suppose for convenience the vessel is a
rectangular parallelepiped with dimensions $l_1$, $l_2$, $l_3$. The Hamiltonian
for the particle is $(p_x^2 + p_y^2 + p_z^2)/2m$ since, as the potential energy is
constant, it may conveniently be taken equal to zero. The corresponding
quantum mechanical operator becomes (cf. Sec. 1)

\[ -\frac{h^2}{8\pi^2 m}\left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}\right). \]

The appropriate Schrödinger equation then becomes

\[ \frac{\partial^2\psi}{\partial x^2} + \frac{\partial^2\psi}{\partial y^2} + \frac{\partial^2\psi}{\partial z^2} + \frac{8\pi^2 mE}{h^2}\,\psi = 0. \tag{52} \]

This may be solved by separation of the variables, i.e., by writing
$\psi(x, y, z)$ in the form of the product of three functions $\psi_x(x)$, $\psi_y(y)$, $\psi_z(z)$
depending respectively on $x$, $y$, and $z$ alone. Thus

\[ \psi = \psi_x\,\psi_y\,\psi_z. \tag{53} \]

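8 This, for example, is the view of Mayer and Mayer, "Statistical Mechanics,"
pp. 111 ff. For another discussion of the problem see Brillouin, op. cit., Vol. 1,
pp. 171 ff.

Equation (52) becomes consequently

\[ \frac{1}{\psi_x}\frac{d^2\psi_x}{dx^2} + \frac{1}{\psi_y}\frac{d^2\psi_y}{dy^2} + \frac{1}{\psi_z}\frac{d^2\psi_z}{dz^2} = -\kappa^2, \tag{54} \]

where $8\pi^2 mE/h^2$ has been replaced by $\kappa^2$. Let us write $\kappa^2 = \kappa_1^2 + \kappa_2^2 + \kappa_3^2$.
Then we merely have to solve the individual ordinary differential
equations

\[ \frac{d^2\psi_x}{dx^2} + \kappa_1^2\psi_x = 0, \qquad \frac{d^2\psi_y}{dy^2} + \kappa_2^2\psi_y = 0, \qquad \frac{d^2\psi_z}{dz^2} + \kappa_3^2\psi_z = 0. \tag{55} \]

The solutions are

\[ \begin{aligned} \psi_x &= A_x e^{i\kappa_1 x} + B_x e^{-i\kappa_1 x}, \\ \psi_y &= A_y e^{i\kappa_2 y} + B_y e^{-i\kappa_2 y}, \\ \psi_z &= A_z e^{i\kappa_3 z} + B_z e^{-i\kappa_3 z}. \end{aligned} \tag{56} \]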

These solutions are subject to the boundary conditions

\[ \begin{aligned} \psi_x &= 0 \ \text{at } x = 0,\ l_1, \\ \psi_y &= 0 \ \text{at } y = 0,\ l_2, \\ \psi_z &= 0 \ \text{at } z = 0,\ l_3, \end{aligned} \tag{57} \]

if we suppose that the container is placed with one corner at the origin
and has the coordinate planes for three of its faces. These boundary
conditions express the fact that the particle is confined to the vessel,
i.e., there is no probability of its ever being found at or outside the
walls. The conditions (57) lead at once to

\[ \sin\kappa_1 l_1 = 0; \qquad \sin\kappa_2 l_2 = 0; \qquad \sin\kappa_3 l_3 = 0, \]

or expressed more explicitly

\[ \kappa_1 = \frac{n_1\pi}{l_1}, \qquad \kappa_2 = \frac{n_2\pi}{l_2}, \qquad \kappa_3 = \frac{n_3\pi}{l_3}, \tag{58} \]

where $n_1$, $n_2$ and $n_3$ are any integers, positive or negative but not zero.
The energy eigenvalues of the free particle in the vessel therefore are

\[ E_{(n_1,n_2,n_3)} = \frac{h^2}{8m}\left(\frac{n_1^2}{l_1^2} + \frac{n_2^2}{l_2^2} + \frac{n_3^2}{l_3^2}\right). \tag{59} \]
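To get a feeling for the magnitudes involved in (59), the formula may be evaluated for a definite case. The Python sketch below is an illustration under stated assumptions: SI units are used, the particle is taken to be an electron, the vessel is a cube 1 cm on a side, and the spacing of the two lowest levels is compared with $kT$ at an assumed temperature of 300 K. The function and constant names are introduced here and are not part of the text.

```python
import math

H_PLANCK = 6.62607015e-34      # Planck's constant, J s
K_BOLTZMANN = 1.380649e-23     # Boltzmann's constant, J/K
M_ELECTRON = 9.1093837e-31     # electron mass, kg (illustrative choice of particle)


def box_energy(n1, n2, n3, mass, l1, l2, l3):
    """Eq. (59): E = (h^2 / 8m) (n1^2/l1^2 + n2^2/l2^2 + n3^2/l3^2)."""
    return H_PLANCK**2 / (8.0 * mass) * (
        (n1 / l1) ** 2 + (n2 / l2) ** 2 + (n3 / l3) ** 2
    )


side = 0.01                    # a 1 cm cube, lengths in metres
ground = box_energy(1, 1, 1, M_ELECTRON, side, side, side)
first_excited = box_energy(2, 1, 1, M_ELECTRON, side, side, side)
spacing_over_kT = (first_excited - ground) / (K_BOLTZMANN * 300.0)
print(ground, first_excited, spacing_over_kT)
```

The ratio printed last is exceedingly small compared with unity, so that on the scale of thermal energies the levels of a particle in a vessel of this size lie very close together.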

Corresponding to every set of integers HI, n 2 , n 3 , eq. (59) gives a 
possible energy value. The corresponding eigenf unction is 



, ^ . . . 

\l/ = C sin sin sin - , (60) 

/i / 2 /a 

where C is a constant determined from the normalization condition 
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expressing the fact that the probability of finding the particle some- 
where in the vessel is unity. This takes the form 



//3 X* ^2 f* ^1 
/ / \t\ 2 dxdydz= 1. 
/o */o 

Application of this yields 



(61) 



(62) 



where $\tau$ is the volume of the vessel. Inspection of (59) now discloses
that because of the small value of $h^2/8m$ in absolute units, if $\tau$ is large,
i.e. the vessel of macroscopic size, 1 cm³ or larger, the energy eigenvalues
are very close together. In fact they may for practical purposes
be considered effectively continuous.⁹ In any case our problem
is to determine from (59) the number of values of $E(n_1, n_2, n_3)$ lying in
the interval $E_j$ to $E_j + dE_j$. For this purpose we set up a three-dimensional
rectangular lattice (Fig. 8-1) in which the sides of the unit
cell in the three coordinate directions are $1/l_1$, $1/l_2$, and $1/l_3$ respectively.
Every point in the lattice is given by the three integers $n_1$, $n_2$, $n_3$
and the distance of the point from the origin is

$$r(n_1, n_2, n_3) = \sqrt{\frac{n_1^2}{l_1^2} + \frac{n_2^2}{l_2^2} + \frac{n_3^2}{l_3^2}}. \tag{63}$$

FIG. 8-1.

With each point in the lattice there is associated an allowed energy
eigenvalue $E(n_1, n_2, n_3)$ by (59). Since change in sign of $n_1$, $n_2$, $n_3$, or
any one of them, does not produce a new energy value nor a new
independent state function we limit the lattice points under consideration
to the first octant of the lattice space, i.e., that for which $n_1$, $n_2$, $n_3$
are positive integers. The number of eigenvalues lying in the energy
interval cited is then equal to the number of lattice points lying between
the octant of the sphere of radius $r_j$ corresponding to $E_j$ and the octant
of the sphere of radius $r_j + dr_j$ corresponding to $E_j + dE_j$. Now each
lattice point can also be associated with a lattice cell like that shown in
Fig. 8-1. Consequently the number of lattice points in the spherical
shell octant just mentioned will also be equal to the number of lattice
cells contained in it and this in turn will be given by the volume of the
shell octant divided by the volume of the unit cell, namely
$1/l_1 l_2 l_3 = 1/\tau$. Obviously the approximation involved in the last statement
improves as $\tau$ becomes larger. In insisting that $\tau$ be macroscopic in
size we are insuring a very good approximation. The volume of the
octant in question in the lattice space is

$$\frac{1}{8}\cdot 4\pi r_j^2\,dr_j = \frac{\pi}{2}\,r_j^2\,dr_j.$$

9 Cf. Lindsay and Margenau, op. cit., pp. 428 ff, for a discussion of this subject



But from (59) and (63)

$$r_j = \frac{2}{h}\sqrt{2mE_j}, \qquad dr_j = \frac{1}{h}\sqrt{\frac{2m}{E_j}}\,dE_j, \tag{64}$$

and therefore the number of energy values required, which is just the
$n_j$ value we have been looking for, is

$$n_j = \frac{2\pi\tau}{h^3}\,(2m)^{3/2} E_j^{1/2}\,dE_j. \tag{65}$$
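
The lattice-counting argument can be checked numerically. The sketch below
is an illustration only: it assumes a cubical vessel with $l_1 = l_2 = l_3 = 1$,
so that the unit cell has volume 1, and the function names are hypothetical.
It counts the lattice points with positive integers $n_1$, $n_2$, $n_3$ inside a
sphere of radius $R$ and compares the count with the octant volume
$\pi R^3/6$, the cumulative form of the shell estimate used for (65).

```python
import math

def count_states(R):
    """Count points (n1, n2, n3), all positive integers, with
    n1^2 + n2^2 + n3^2 <= R^2 (unit cube, l1 = l2 = l3 = 1)."""
    R2 = R * R
    total = 0
    for n1 in range(1, R + 1):
        for n2 in range(1, R + 1):
            rem = R2 - n1 * n1 - n2 * n2
            if rem < 1:
                break  # larger n2 only makes rem smaller
            total += math.isqrt(rem)  # admissible n3 = 1 .. floor(sqrt(rem))
    return total

def octant_volume(R):
    """Volume of one octant of a sphere of radius R."""
    return math.pi * R ** 3 / 6.0

ratio = count_states(100) / octant_volume(100)
print(ratio)
```

The ratio lies slightly below unity and approaches it as $R$ grows, which is
the sense in which the approximation improves for a macroscopic vessel.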

The quantum statistical distribution formula (47) then takes the form

$$N_j = \frac{2\pi\tau}{h^3}\,\frac{(2m)^{3/2} E_j^{1/2}\,dE_j}{e^{-\gamma_1} e^{E_j/kT} + a}. \tag{66}$$

Once more we emphasize that in this formula $a = +1$ corresponds to
the Fermi-Dirac statistics and $a = -1$ to the Bose-Einstein statistics.
For particles with a spin quantum number $\frac{1}{2}$ in which the energy is
practically the same for both spin directions (cf. the remarks near the
end of Sec. 1) it is necessary to multiply the right side of (66) by 2 to
get the correct value of $N_j$. For in this case we can effectively have
two particles (with opposite spins) for each numerical value of the
energy. This will be true with free electrons. For molecules without
a spin, (66) should apply as it stands. On the other hand,¹⁰ for molecules
with nuclear spin quantum number $s$, $N_j$ must be multiplied by
$2s + 1$.

10 Cf. Mayer and Mayer, op. cit., pp. 135 f.

5. QUANTUM STATISTICS OF A WEAKLY DEGENERATE GAS

We have still the task of evaluating the statistical parameter $\gamma_1$
in the distribution expression (66). This can be done by utilizing the
condition $\sum N_j = N$, the total number of particles in the aggregate.
In this connection considerable interest attaches in the first place to
the special case $a = 0$. Then (66) should reduce effectively to the
Maxwellian distribution of classical statistics. From the way we have
expressed $n_j$ we must write $\sum N_j$ as an integral and have¹¹

$$\frac{2\pi\tau}{h^3}\,(2m)^{3/2}\,e^{\gamma_1}\int_0^\infty \sqrt{E}\,e^{-E/kT}\,dE = N. \tag{67}$$

The integral in (67) is a well-known one and is equal to
$\frac{\sqrt{\pi}}{2}(kT)^{3/2}$. Hence the parameter $\gamma_1$ is given by

$$e^{\gamma_1} = \frac{Nh^3}{\tau\,(2\pi mkT)^{3/2}}. \tag{68}$$

Substitution into $N_j$, now rewritten as $dN$ for greater consistency,
gives

$$dN = \frac{2N}{\sqrt{\pi}}\,(kT)^{-3/2}\,\sqrt{E}\,e^{-E/kT}\,dE \tag{69}$$

for the number of particles in the assembly having energies lying in the
interval from $E$ to $E + dE$. This can be readily transformed to give
the distribution in terms of velocity by writing $E = mv^2/2$. The result
is

$$dN = 4\pi N\left(\frac{m}{2\pi kT}\right)^{3/2} v^2\,e^{-mv^2/2kT}\,dv, \tag{70}$$

which, with allowances for difference in notation, is identical with the
Maxwellian distribution in eq. (53) of Chapter V. This at any rate
indicates that the general distribution formula (47) includes the
classical distribution as a special case, viz., that for $a = 0$.
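
A quick numerical check of the classical limit may be helpful. The sketch
below is illustrative only: it works in units with $kT = 1$, uses a plain
midpoint rule, and its function names are hypothetical. It integrates the
energy distribution (69) per particle to confirm that it is normalized and
that the mean energy is $\frac{3}{2}kT$.

```python
import math

def maxwell_energy_density(E, kT):
    """dN/dE per particle from eq. (69)."""
    return 2.0 / math.sqrt(math.pi) * kT ** -1.5 * math.sqrt(E) * math.exp(-E / kT)

def integrate_midpoint(f, a, b, n):
    """Midpoint-rule estimate of the integral of f from a to b."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

kT = 1.0
upper = 40.0 * kT  # the integrand is negligible beyond this energy
total = integrate_midpoint(lambda E: maxwell_energy_density(E, kT), 0.0, upper, 200_000)
mean_E = integrate_midpoint(lambda E: E * maxwell_energy_density(E, kT), 0.0, upper, 200_000)
print(total, mean_E)
```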

We must now consider the general case where $-1 \le a \le +1$, with
$a = +1$ (Fermi-Dirac) and $a = -1$ (Bose-Einstein) as the interesting
limiting cases. The evaluation of $e^{\gamma_1}$ proceeds as before, the condition
$\sum N_j = N$ now taking the form

$$\frac{2\pi\tau}{h^3}\,(2m)^{3/2}\int_0^\infty \frac{\sqrt{E}\,dE}{e^{-\gamma_1}e^{E/kT} + a} = N. \tag{71}$$

There is a certain convenience in introducing the transformation
$E/kT = u$. Then (71) becomes

$$\frac{2\pi\tau}{h^3}\,(2mkT)^{3/2}\int_0^\infty \frac{\sqrt{u}\,du}{e^{-\gamma_1}e^{u} + a} = N. \tag{72}$$

11 This is really a matter of convenience justified by the fact already emphasized
that the differences between successive eigenvalues are very small compared with
the eigenvalues themselves. To handle the matter by means of a summation would
involve using (59) directly, leading to a rather difficult problem in algebra.

We shall set $\gamma_1 = -\gamma$ and confine our attention for the rest of this
section to the case in which $\gamma > 1$ but is not sufficiently large for us
to be justified in neglecting $a$ compared with $e^{\gamma}e^{u}$. Now we can transform
the integral in (72) in the following fashion, letting $a = \pm b$,
where $b$ is always positive, though $a$ may be positive or negative.
For the Fermi-Dirac and Bose-Einstein statistics $b = 1$. For other
intermediate brands of statistics $0 < b < 1$. We shall carry through
the general case. The integral in (72) then becomes

$$\int_0^\infty \frac{\sqrt{u}\,du}{e^{\gamma}e^{u} \pm b} = \frac{1}{b}\int_0^\infty \frac{\sqrt{u}\,du}{\pm 1 + e^{(\gamma - \log b) + u}}.$$

Call $e^{\gamma - \log b} = e^{\alpha}$. The foregoing integral is a special form of the
integral

$$U(p, \alpha) = \int_0^\infty \frac{u^p\,du}{e^{\alpha + u} \pm 1},$$

with $\alpha > 1$ and $p$ rational and positive. Since we can expand the
denominator of the integrand in the form

$$\frac{1}{e^{\alpha + u} \pm 1} = \sum_{j=1}^{\infty} (\mp 1)^{j-1}\,e^{-j(\alpha + u)},$$

it follows that

$$U(p, \alpha) = \int_0^\infty \sum_{j=1}^{\infty} (\mp 1)^{j-1}\,u^p\,e^{-j(\alpha + u)}\,du. \tag{73}$$

Introduce the transformation $ju = t$ and utilize the fact that

$$\int_0^\infty t^p e^{-t}\,dt = \Gamma(p + 1). \tag{74}$$

The result is the convergent expansion

$$U(p, \alpha) = \left[\frac{e^{-\alpha}}{1^{p+1}} \mp \frac{e^{-2\alpha}}{2^{p+1}} + \frac{e^{-3\alpha}}{3^{p+1}} \mp \cdots\right]\Gamma(p + 1). \tag{75}$$
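
The expansion (75) can be compared with a direct numerical evaluation of
the integral $U(p, \alpha)$. The following is a minimal sketch under stated
assumptions: the value $\alpha = 2$, the truncation of the series, the cutoff of
the integral and the function names are all illustrative choices. The
argument `sign = +1` selects the upper signs (denominator $e^{\alpha+u} + 1$) and
`sign = -1` the lower signs.

```python
import math

def U_series(p, alpha, sign, terms=60):
    """Truncated expansion (75)."""
    s = 0.0
    for j in range(1, terms + 1):
        s += (-sign) ** (j - 1) * math.exp(-j * alpha) / j ** (p + 1)
    return s * math.gamma(p + 1)

def U_numeric(p, alpha, sign, upper=60.0, n=200_000):
    """Midpoint-rule estimate of the defining integral of U(p, alpha)."""
    h = upper / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * h
        total += u ** p / (math.exp(alpha + u) + sign)
    return total * h

for sign in (+1, -1):
    print(sign, U_series(0.5, 2.0, sign), U_numeric(0.5, 2.0, sign))
```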



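In the particular case of (72) $p = 1/2$. Recalling that $\Gamma(3/2) = \sqrt{\pi}/2$
we finally have

$$N = \frac{\tau}{b h^3}\,(2\pi mkT)^{3/2}\left[e^{-\alpha} \mp \frac{e^{-2\alpha}}{2^{3/2}} + \frac{e^{-3\alpha}}{3^{3/2}} \mp \cdots\right]. \tag{76}$$

From this the first approximation to $e^{-\alpha}$ is obtained in the form

$$e^{-\alpha} = \frac{h^3 N b}{\tau\,(2\pi mkT)^{3/2}}. \tag{77}$$

It is interesting to observe that if we set $b = 1$, this is just the expression
for $e^{\gamma_1}$ in the classical statistics distribution formula. The second
approximation to $e^{-\alpha}$ is

$$e^{-\alpha} = \frac{h^3 N b}{\tau\,(2\pi mkT)^{3/2}}\left[1 \pm \frac{h^3 N b}{2^{3/2}\,\tau\,(2\pi mkT)^{3/2}}\right]. \tag{78}$$

This approximation introduces a distinction between positive and
negative $a$. In (78) the plus sign corresponds to positive $a$, while the
minus sign corresponds to negative $a$. In the limiting cases in which
$a = \pm 1$ we have $b = 1$. The value of $e^{\gamma_1}$ for the Fermi-Dirac and
Bose-Einstein statistics respectively is given to the second approximation
(recalling that now $e^{\gamma_1} = e^{-\alpha}$) by

$$e^{\gamma_1}_{\text{Fermi-Dirac}} = \frac{h^3 N}{\tau\,(2\pi mkT)^{3/2}}\left[1 + \frac{h^3 N}{2^{3/2}\,\tau\,(2\pi mkT)^{3/2}}\right], \tag{79}$$

$$e^{\gamma_1}_{\text{Bose-Einstein}} = \frac{h^3 N}{\tau\,(2\pi mkT)^{3/2}}\left[1 - \frac{h^3 N}{2^{3/2}\,\tau\,(2\pi mkT)^{3/2}}\right]. \tag{80}$$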

For 1 cm³ of hydrogen at room temperature, the expression
$h^3N/\tau(2\pi mkT)^{3/2}$ has a numerical value of the order of $10^{-4}$. Hence
$\alpha$ is considerably greater than unity and the approximation in (78)
is a very good one in this case. For reasons which will appear more
clearly in the next section a gas for which this is true is said to be
weakly degenerate. For the present we shall use the term degeneracy
as an indication of the deviation of the properties of a gas treated
quantum statistically from those of an ideal gas treated by classical
statistics. The measure of the degeneracy is the value of $e^{\gamma_1}$. For
hydrogen under the conditions just mentioned the degeneracy is indeed
so weak that $e^{\gamma_1}$ reduces for all practical purposes to the value for an
ideal gas in classical statistics. As the temperature is lowered, $e^{\gamma_1}$
increases and the degeneracy becomes greater. However for hydrogen
even at its critical point $e^{\gamma_1}$ is of the order of only $10^{-2}$.

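The distinction between quantum and classical statistics will be
further brought out by a calculation of the total average energy of the
gas as a function of the temperature. This is given by

$$E = \frac{2\pi\tau}{h^3}\,(2mkT)^{3/2}\,kT\int_0^\infty \frac{u^{3/2}\,du}{e^{-\gamma_1}e^{u} + a}. \tag{81}$$

We again make use of the integral $U(p, \alpha)$ in eq. (75) with $p = 3/2$,
and can finally write

$$E = \frac{3\tau}{2bh^3}\,(2\pi mkT)^{3/2}\,kT\left[e^{-\alpha} \mp \frac{e^{-2\alpha}}{2^{5/2}} + \frac{e^{-3\alpha}}{3^{5/2}} \mp \cdots\right], \tag{82}$$

where the upper sign refers to positive $a$ and the lower sign to negative
$a$. Combining this with the expression for $N$ in (76) gives

$$E = \frac{3}{2}NkT\;\frac{e^{-\alpha} \mp \dfrac{e^{-2\alpha}}{2^{5/2}} + \dfrac{e^{-3\alpha}}{3^{5/2}} \mp \cdots}{e^{-\alpha} \mp \dfrac{e^{-2\alpha}}{2^{3/2}} + \dfrac{e^{-3\alpha}}{3^{3/2}} \mp \cdots}. \tag{83}$$

If $e^{-\alpha}$ is sufficiently small we can approximate this successfully by

$$E = \frac{3}{2}NkT\left[1 \pm \frac{e^{-\alpha}}{2^{5/2}} - e^{-2\alpha}\left(\frac{2}{3^{5/2}} - \frac{1}{16}\right) \cdots\right], \tag{84}$$

where the upper sign now refers to positive $a$ and the lower sign to
negative $a$. If we set for brevity

$$K = \frac{h^3 N}{\tau\,(2\pi mkT)^{3/2}}, \tag{85}$$

(78) with $b = 1$ becomes

$$e^{-\alpha} = K\left(1 \pm \frac{K}{2^{3/2}}\right). \tag{86}$$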


We can then substitute into (84) and obtain for the approximate
expressions of the energy on the Fermi-Dirac and Bose-Einstein
statistics respectively

$$E_{\text{F-D}} = \frac{3}{2}NkT\left[1 + \frac{K}{2^{5/2}} - K^2\left(\frac{2}{3^{5/2}} - \frac{1}{8}\right) + \cdots\right], \tag{87}$$

$$E_{\text{B-E}} = \frac{3}{2}NkT\left[1 - \frac{K}{2^{5/2}} - K^2\left(\frac{2}{3^{5/2}} - \frac{1}{8}\right) + \cdots\right]. \tag{88}$$

These may be considered as correct approximations as far as terms of
order $K^2$ are concerned. It is plain that for a weakly degenerate gas
the energy differs but slightly from the classical value.
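
The second-order expansion of the energy in powers of $K$ can be tested
against the exact series ratio (83). The sketch below is illustrative: it
takes $b = 1$, treats $x = e^{-\alpha}$ as the independent variable, obtains $K$
from the bracket of (76), and uses hypothetical function names. The
argument `sign = +1` stands for Fermi-Dirac and `sign = -1` for
Bose-Einstein.

```python
def series(x, power, sign, terms=40):
    """Sum over j >= 1 of (-sign)^(j-1) x^j / j^power: the bracket of (76) or (82)."""
    return sum((-sign) ** (j - 1) * x ** j / j ** power for j in range(1, terms + 1))

def K_from_x(x, sign):
    """K of (85); with b = 1 it equals the bracket of (76)."""
    return series(x, 1.5, sign)

def energy_ratio_exact(x, sign):
    """E / (3/2 N k T) from (83), with x = e^(-alpha)."""
    return series(x, 2.5, sign) / series(x, 1.5, sign)

def energy_ratio_approx(K, sign):
    """1 +/- K / 2^(5/2) - K^2 (2 / 3^(5/2) - 1/8)."""
    return 1.0 + sign * K / 2 ** 2.5 - K * K * (2.0 / 3 ** 2.5 - 0.125)

for sign in (+1, -1):
    x = 0.1
    K = K_from_x(x, sign)
    print(sign, energy_ratio_exact(x, sign), energy_ratio_approx(K, sign))
```

The two columns agree up to terms of order $K^3$, and the Fermi-Dirac energy
lies above the classical value while the Bose-Einstein energy lies below it.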

We now wish to obtain the equation of state of a weakly degenerate
gas. This involves getting the entropy and free energy. For the
entropy we have agreed (Sec. 3) to use the definition

$$S = k\log w. \tag{89}$$

By inserting the value of $N_j/n_j$ given in eq. (45) into the expression
for $\log w$ in eq. (42), we are led to

$$S = k\left[\frac{E}{kT} - \gamma_1 N + \frac{1}{a}\sum_j n_j \log\left(1 + a\,e^{\gamma_1}e^{-E_j/kT}\right)\right]. \tag{90}$$

From (65), replacing the summation by an equivalent integration, we
have

$$\sum_j n_j \log\left(1 + a\,e^{\gamma_1}e^{-E_j/kT}\right) = \frac{2\pi\tau}{h^3}\,(2m)^{3/2}\int_0^\infty \sqrt{E}\,\log\left(1 + a\,e^{\gamma_1}e^{-E/kT}\right)dE.$$

Partial integration of the integral leads finally to

$$S = -k\gamma_1 N + \frac{5}{3}\,\frac{E}{T}. \tag{91}$$
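
The partial integration behind (91) can be written out as follows. This is
a sketch added for illustration; to avoid confusion with the integration
variable, the total energy of the gas is written $\bar{E}$ here, which is a
notation introduced only for this step.

```latex
\int_0^\infty \sqrt{E}\,\log\!\left(1 + a\,e^{\gamma_1}e^{-E/kT}\right)dE
  = \Big[\tfrac{2}{3}E^{3/2}\log\!\left(1 + a\,e^{\gamma_1}e^{-E/kT}\right)\Big]_0^\infty
    + \frac{2a}{3kT}\int_0^\infty \frac{E^{3/2}\,dE}{e^{-\gamma_1}e^{E/kT} + a} ,
\\[6pt]
\text{the bracket vanishing at both limits, so that}\quad
\frac{1}{a}\sum_j n_j \log\!\left(1 + a\,e^{\gamma_1}e^{-E_j/kT}\right)
  = \frac{2}{3}\,\frac{\bar{E}}{kT} ,
\\[6pt]
S = k\left[\frac{\bar{E}}{kT} - \gamma_1 N + \frac{2}{3}\frac{\bar{E}}{kT}\right]
  = -k\gamma_1 N + \frac{5}{3}\,\frac{\bar{E}}{T} .
```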



The free energy is

$$\Psi = E - TS = \gamma_1 kTN - \tfrac{2}{3}E. \tag{92}$$

The equation of state then becomes (eq. 12, Chapter III)

$$P = -\left(\frac{\partial\Psi}{\partial\tau}\right)_T = -NkT\left(\frac{\partial\gamma_1}{\partial\tau}\right)_T + \frac{2}{3}\left(\frac{\partial E}{\partial\tau}\right)_T. \tag{93}$$

From (78) with $b = 1$, we can evaluate $(\partial\gamma_1/\partial\tau)_T$ (recalling that
$e^{-\alpha} = e^{\gamma_1}$) and from (84) we can get $(\partial E/\partial\tau)_T$, being careful to
keep terms of appropriate order in each case. The final result is

$$P = \frac{2}{3}\,\frac{E}{\tau}. \tag{94}$$

It is worth noting that this result, which we have already derived in
the classical kinetic theory (eq. 12 of Chapter V) holds for both brands
of quantum statistics. It is therefore a very general formula.
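
To first order in $K$ the passage from (93) to (94) may be verified
directly. The lines below are an illustrative sketch that uses only (78)
with $b = 1$, (85) and the first-order part of (84), together with the fact
that $K$ is inversely proportional to $\tau$ at fixed $N$ and $T$.

```latex
\gamma_1 = \log K + \log\!\left(1 \pm \frac{K}{2^{3/2}}\right)
         \approx \log K \pm \frac{K}{2^{3/2}} ,
\qquad
\left(\frac{\partial \gamma_1}{\partial \tau}\right)_T
         = -\frac{1}{\tau}\left(1 \pm \frac{K}{2^{3/2}}\right) ,
\\[6pt]
E \approx \tfrac{3}{2}NkT\left(1 \pm \frac{K}{2^{5/2}}\right) ,
\qquad
\left(\frac{\partial E}{\partial \tau}\right)_T
         = \mp\,\tfrac{3}{2}NkT\,\frac{K}{2^{5/2}\,\tau} ,
\\[6pt]
P = \frac{NkT}{\tau}\left(1 \pm \frac{K}{2^{3/2}} \mp \frac{K}{2^{5/2}}\right)
  = \frac{NkT}{\tau}\left(1 \pm \frac{K}{2^{5/2}}\right)
  = \frac{2}{3}\,\frac{E}{\tau} .
```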

The equation of state for a weakly degenerate gas in the two brands
of statistics then appears in the approximate form

$$P\tau = NkT\left[1 + \frac{K}{2^{5/2}} - K^2\left(\frac{2}{3^{5/2}} - \frac{1}{8}\right) + \cdots\right] \quad \text{(Fermi-Dirac)} \tag{95}$$

$$P\tau = NkT\left[1 - \frac{K}{2^{5/2}} - K^2\left(\frac{2}{3^{5/2}} - \frac{1}{8}\right) + \cdots\right] \quad \text{(Bose-Einstein)} \tag{96}$$

As a consequence of these equations, a weakly degenerate Bose-Einstein
gas is more compressible than an ideal gas, while a weakly
degenerate Fermi-Dirac gas is less compressible than an ideal gas.
Actually the differences are so slight that they are entirely masked in
real gases by the departures from ideality owing to the forces between
the particles. These are of course neglected in eqs. (95) and (96). It
is only at very low temperatures that the terms in $K$ and $K^2$ become
appreciable and here the intermolecular forces become particularly
significant.

6. WEAKLY DEGENERATE GAS BY THE DARWIN-FOWLER METHOD 

It will be worth while to consider how the problem of Sec. 5 can be
attacked from the standpoint of Sec. 2. For this purpose we need
expressions for $N$ and $E$ analogous to eqs. (71) and (81) in Sec. 5. We
also need an expression for the entropy.

We proceed by analogy with the development in Chapter VII.
There the sum of all the weights, $\sum W$, was shown to be equal to the
contour integral

$$\sum W = \frac{1}{2\pi i}\oint \frac{Z^N\,dz}{z^{E+1}},$$

where $Z = \sum_j g_j z^{E_j}$ is the partition function. The expression for the
energy in terms of the parameter $\zeta$ (later associated with the temperature
by $\zeta = e^{-1/kT}$) was obtained by expressing the fact that $Z/z^{E/N}$
has a minimum at $z = \zeta$ on the real axis. This led to eq. (20) of
Chapter VII. What is the analogous equation in the case of quantum
statistics? In eq. (23) of this chapter the sum of weights is given by
the double contour integral

$$\sum W = \frac{1}{(2\pi i)^2}\oint\!\oint \frac{M(x, z)\,dx\,dz}{x^{N+1}\,z^{E+1}}.$$

We expect to find the connection between $E$ and $\zeta$ in this case by
expressing the fact that $(M/z^E)^{1/N}$ has a minimum at $z = \zeta$ on the real
axis. This demands that

$$\frac{\partial}{\partial z}\left(\frac{M}{z^E}\right)^{1/N} = 0 \quad \text{at } z = \zeta. \tag{97}$$

The differentiation yields

$$E = \zeta\,\frac{\partial}{\partial\zeta}\left[\log M(\zeta)\right]. \tag{98}$$



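If we introduce from eq. (22) the Fermi-Dirac expression for $M$, which
now takes the form

$$M(\zeta) = \prod_j \left(1 + \mu\,\zeta^{E_j}\right)^{n_j}, \quad \text{(Fermi-Dirac)} \tag{99}$$

the energy expression (98) can be written

$$E_{\text{F-D}} = \zeta\,\frac{\partial}{\partial\zeta}\sum_j n_j \log\left(1 + \mu\,\zeta^{E_j}\right). \tag{100}$$

The parameter $\mu$ is the value of the complex variable $x$ along the real
axis where $(M/x^N)^{1/N}$ has a minimum. The condition for this is

$$\frac{\partial}{\partial x}\left(M/x^N\right)^{1/N} = 0 \quad \text{at } x = \mu. \tag{101}$$

When the differentiation is performed the result is

$$N_{\text{F-D}} = \mu\,\frac{\partial}{\partial\mu}\sum_j n_j \log\left(1 + \mu\,\zeta^{E_j}\right), \tag{102}$$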

where again we are of course employing the Fermi-Dirac $M$. The
reader will observe that $M(\zeta)$ in (99) is not quite the same $M$ which we
used in eq. (22). We have incorporated in it the exponent $n_j$, the
number of cells associated with the energy shell $E_j$. The reason for
this step should be clear from the discussion in Sec. 3 where we compared
the Darwin-Fowler distribution formulas with those obtained
by the alternative method. The $E_j$ which we are using throughout
our present development refers to a whole shell of values and not a
single energy state. To it must therefore be attached the weight $n_j$.
The expressions for $E$ and $N$ in the Bose-Einstein statistics are
obtained by the same reasoning as (100) and (102). The only difference
is the function $M$, which is now

$$M(\zeta) = \prod_j \left(\frac{1}{1 - \mu\,\zeta^{E_j}}\right)^{n_j} \quad \text{(Bose-Einstein)}. \tag{103}$$

The results are

$$E_{\text{B-E}} = -\zeta\,\frac{\partial}{\partial\zeta}\sum_j n_j \log\left(1 - \mu\,\zeta^{E_j}\right), \tag{104}$$

$$N_{\text{B-E}} = -\mu\,\frac{\partial}{\partial\mu}\sum_j n_j \log\left(1 - \mu\,\zeta^{E_j}\right). \tag{105}$$

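Let us go back and evaluate (102), using for this purpose the
expression for $n_j$ already obtained in (65) and replacing, as usual, the
summation over $j$ by an integration over the energy. We also set
$\zeta = e^{-1/kT}$. Then (102) becomes

$$N_{\text{F-D}} = \mu\,\frac{\partial}{\partial\mu}\,\frac{2\pi\tau}{h^3}\,(2m)^{3/2}\int_0^\infty \log\left(1 + \mu\,e^{-E/kT}\right)\sqrt{E}\,dE. \tag{106}$$

Differentiation under the integral sign is here allowed and the result is

$$N_{\text{F-D}} = \frac{2\pi\tau}{h^3}\,(2m)^{3/2}\int_0^\infty \frac{\sqrt{E}\,dE}{\frac{1}{\mu}\,e^{E/kT} + 1}. \tag{107}$$

But this is precisely our eq. (71) with $a = +1$ with $1/\mu$ in place of
$e^{-\gamma_1}$. In similar fashion

$$E_{\text{F-D}} = \zeta\,\frac{\partial}{\partial\zeta}\,\frac{2\pi\tau}{h^3}\,(2m)^{3/2}\int_0^\infty \log\left(1 + \mu\,\zeta^{E}\right)\sqrt{E}\,dE.$$

The differentiation leads to

$$E_{\text{F-D}} = \frac{2\pi\tau}{h^3}\,(2m)^{3/2}\int_0^\infty \frac{E^{3/2}\,dE}{\frac{1}{\mu}\,e^{E/kT} + 1}. \tag{108}$$

With the transformation $u = E/kT$, this is identical with our previous
eq. (81) when $a = +1$.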
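
The differentiation that carries (106) into (107) can be checked
numerically. The sketch below is an illustration under stated assumptions:
it works with the dimensionless variable $u = E/kT$, drops the common
prefactor, picks $\mu = 0.5$ arbitrarily, and replaces the derivative with
respect to $\mu$ by a central difference. The function names are
hypothetical.

```python
import math

def integrate_midpoint(f, a, b, n=100_000):
    """Midpoint-rule estimate of the integral of f from a to b."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def log_integral(mu):
    """Integral of sqrt(u) log(1 + mu e^-u) du, as in (106)."""
    return integrate_midpoint(
        lambda u: math.sqrt(u) * math.log1p(mu * math.exp(-u)), 0.0, 60.0)

def occupation_integral(mu):
    """Integral of sqrt(u) / (e^u / mu + 1) du, as in (107)."""
    return integrate_midpoint(
        lambda u: math.sqrt(u) / (math.exp(u) / mu + 1.0), 0.0, 60.0)

mu = 0.5
step = 1e-4
lhs = mu * (log_integral(mu + step) - log_integral(mu - step)) / (2.0 * step)
rhs = occupation_integral(mu)
print(lhs, rhs)
```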

The reader may proceed to show that similar results are obtained
with the Bose-Einstein statistics. It is clear that the straightforward
application of the Darwin-Fowler quantum statistical formulas leads
to all the results of Secs. 3, 4, and 5. We should, however, examine the
entropy. In Chapter VII (eq. 86), we saw that in the Darwin and
Fowler method the entropy is defined as

$$S = k\log\sum W - k\log N!.$$

This was found in the case of classical statistics to lead to

$$S = kN\log Z(\zeta) - kE\log\zeta - k\log N!. \tag{109}$$

What is the corresponding expression in quantum statistics? This
will presumably be obtained by replacing the partition function $Z$
by its quantum statistical analogue. Actually to be rigorous one
would have to evaluate the multiple contour integral (23) by a generalization
of the method of steepest descents used in getting the expression
for $\oint \frac{Z^N}{z^{E+1}}\,dz$ of Chapter VII.
This has been done by Fowler.¹² We shall not repeat it here but merely
point out that

$$S = k\log\left[\frac{1}{(2\pi i)^2}\oint\!\oint \frac{M(x, z)\,dx\,dz}{x^{N+1}\,z^{E+1}}\right] = k\log M(\mu, \zeta) - kN\log\mu - kE\log\zeta \tag{110}$$

to the same approximation which led to (109). The reader who has 
followed through the evaluation of (28) in Chapter VII will see the 
plausibility of (110) though for the rigorous demonstration Fowler 
must be consulted. Incidentally (110) shows that the quantum statistical
analogue of the partition function $Z$ is

$$\frac{\left[M(\mu, \zeta)\right]^{1/N}}{\mu}. \tag{111}$$



If now we apply (110) to the Fermi-Dirac statistics we get, using (99)
and $\mu = e^{\gamma_1}$

$$S = k\sum_j n_j \log\left(1 + e^{\gamma_1}e^{-E_j/kT}\right) - kN\gamma_1 + \frac{E}{T}, \tag{112}$$

which is exactly eq. (90) with $a = +1$. The reader can proceed to
derive the corresponding expression for the entropy on the Bose-Einstein
statistics and again obtain agreement with eq. (90), this
time with $a = -1$, of course.

12 "Statistical Mechanics," second edition, pp. 47 ff.

7. FERMI-DIRAC STATISTICS OF A STRONGLY DEGENERATE GAS

We shall now return to the general quantum statistical distribution
law (47) and consider the case of Fermi-Dirac statistics ($a = +1$) when
$\gamma_1$ is positive and $e^{\gamma_1}$ so large that for a considerable range of values of
$E_j$, we have $1 \gg e^{-\gamma_1}e^{E_j/kT}$. In line with the discussion in Sec. 5 we
shall refer to this as a case of strong degeneracy, since it necessarily
implies a considerable deviation of $N_j/n_j$ from the classical distribution
formula (48).

As before we evaluate $e^{\gamma_1}$ by utilizing the expression for $N$ in terms
of $\gamma_1$ and $T$, i.e., eq. (72). For $a = 1$ this becomes

$$\frac{2\pi\tau}{h^3}\,(2mkT)^{3/2}\int_0^\infty \frac{\sqrt{u}\,du}{e^{-\gamma_1}e^{u} + 1} = N. \tag{113}$$



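The integral in (113) is a special case of the integral

$$U'(p, \gamma_1) = \int_0^\infty \frac{u^p\,du}{1 + e^{-\gamma_1 + u}}, \tag{114}$$

where $\gamma_1 \gg 1$ and $p$ is rational and positive. Let us introduce the
transformation $\gamma_1 y = -\gamma_1 + u$ and obtain

$$U'(p, \gamma_1) = \gamma_1^{p+1}\int_{-1}^\infty \frac{(1 + y)^p\,dy}{1 + e^{\gamma_1 y}}. \tag{115}$$

This may further be written

$$U'(p, \gamma_1) = \gamma_1^{p+1}\left[\int_0^1 \frac{(1 - y)^p\,dy}{1 + e^{-\gamma_1 y}} + \int_0^\infty \frac{(1 + y)^p\,dy}{1 + e^{\gamma_1 y}}\right].$$

The first integral on the right becomes

$$\frac{1}{p + 1} - \int_0^1 \frac{(1 - y)^p\,dy}{1 + e^{\gamma_1 y}},$$

whence

$$U'(p, \gamma_1) = \gamma_1^{p+1}\left[\frac{1}{p + 1} - \int_0^1 \frac{(1 - y)^p\,dy}{1 + e^{\gamma_1 y}} + \int_0^\infty \frac{(1 + y)^p\,dy}{1 + e^{\gamma_1 y}}\right].$$

Now since $\gamma_1 \gg 1$, $1 + e^{\gamma_1 y}$ is so large that the integration from 1
to $\infty$ adds very little to the value of the second integral in the bracket.
Consequently, we can write to a good approximation

$$U'(p, \gamma_1) = \gamma_1^{p+1}\left[\frac{1}{p + 1} + \int_0^1 \frac{(1 + y)^p - (1 - y)^p}{1 + e^{\gamma_1 y}}\,dy\right].$$

Since $y < 1$ we can expand the numerator in the integrand by the
ordinary binomial expansion and obtain for the integral

$$\int_0^1 \frac{(1 + y)^p - (1 - y)^p}{1 + e^{\gamma_1 y}}\,dy = 2\sum_{j=0}^\infty \binom{p}{2j+1}\int_0^1 \frac{y^{2j+1}\,dy}{1 + e^{\gamma_1 y}}. \tag{118}$$

The binomial coefficients are written in the form $\binom{p}{2j+1}$ as in
eq. (7), Chapter II. The integral in (118) can further be written as a
double sum by the expansion of $1/(1 + e^{\gamma_1 y})$, viz.,

$$\int_0^1 \frac{y^{2j+1}\,dy}{1 + e^{\gamma_1 y}} = \sum_{k=0}^\infty (-1)^k \int_0^1 y^{2j+1}\,e^{-(k+1)\gamma_1 y}\,dy. \tag{119}$$

The next step is to let $t = (k + 1)\gamma_1 y$, whence the above integral
becomes

$$\int_0^1 y^{2j+1}\,e^{-(k+1)\gamma_1 y}\,dy = \frac{1}{[(k + 1)\gamma_1]^{2j+2}}\int_0^{(k+1)\gamma_1} t^{2j+1}\,e^{-t}\,dt. \tag{120}$$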

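This suggests the gamma function, save that the integration limits
are from 0 to $(k + 1)\gamma_1$. However we again use the fact that $\gamma_1$ is
very large to assure us that for any $j$ the

$$\frac{1}{[(k + 1)\gamma_1]^{2j+2}}\int_{(k+1)\gamma_1}^\infty t^{2j+1}\,e^{-t}\,dt$$

is negligible compared with the integral in (120). Hence we can safely alter
the upper limit in (120) to $\infty$ and write

$$\int_0^1 y^{2j+1}\,e^{-(k+1)\gamma_1 y}\,dy = \frac{\Gamma(2j + 2)}{[(k + 1)\gamma_1]^{2j+2}}. \tag{121}$$

Finally we have for $U'(p, \gamma_1)$

$$U'(p, \gamma_1) = \gamma_1^{p+1}\left[\frac{1}{p + 1} + 2\sum_{j=0}^\infty \binom{p}{2j+1}\frac{c_j\,\Gamma(2j + 2)}{\gamma_1^{2j+2}}\right], \tag{122}$$

where we have set

$$c_j = \sum_{k=0}^\infty \frac{(-1)^k}{(k + 1)^{2j+2}}. \tag{123}$$

Using (122) in connection with (113) enables us to write

$$N = \frac{2\pi\tau}{h^3}\,(2mkT)^{3/2}\,\gamma_1^{3/2}\left[\frac{2}{3} + 2\sum_{j=0}^\infty \binom{1/2}{2j+1}\frac{c_j\,\Gamma(2j + 2)}{\gamma_1^{2j+2}}\right]. \tag{124}$$

For most applications it will be sufficient to confine our attention to the
first term in the summation in the bracket in (124) and write

$$N = \frac{2\pi\tau}{h^3}\,(2mkT)^{3/2}\,\gamma_1^{3/2}\left[\frac{2}{3} + \frac{c_0}{\gamma_1^2}\right]. \tag{125}$$

Now

$$c_0 = \sum_{k=0}^\infty \frac{(-1)^k}{(k + 1)^2} = 1 - \frac{1}{2^2} + \frac{1}{3^2} - \frac{1}{4^2} + \cdots = \frac{\pi^2}{12}. \tag{126}$$

Consequently to this approximation, the equation connecting $N$ and
$\gamma_1$ becomes

$$N = \frac{4\pi\tau}{3h^3}\,(2mkT)^{3/2}\,\gamma_1^{3/2}\left[1 + \frac{\pi^2}{8\gamma_1^2}\right]. \tag{127}$$

Recalling that $\gamma_1 \gg 1$, the first approximation to $\gamma_1$ from (127) is

$$\gamma_1 = \frac{h^2}{2mkT}\left(\frac{3N}{4\pi\tau}\right)^{2/3}. \tag{128}$$

The second approximation proves to be

$$\gamma_1 = \frac{h^2}{2mkT}\left(\frac{3N}{4\pi\tau}\right)^{2/3}\left[1 - \frac{\pi^2}{12}\left(\frac{2mkT}{h^2}\right)^2\left(\frac{4\pi\tau}{3N}\right)^{4/3}\right]. \tag{129}$$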
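
The quality of the two-term approximation behind (127) and (131) can be
examined numerically. The sketch below is illustrative: the value
$\gamma_1 = 20$, the cutoff of the integral and the function names are
assumptions made for the example. It compares a midpoint-rule estimate of
$U'(p, \gamma_1)$ in (114) with the first two terms of (122), using
$\binom{p}{1} = p$, $\Gamma(2) = 1$ and $c_0 = \pi^2/12$.

```python
import math

def U_prime_numeric(p, gamma1, n=200_000):
    """Midpoint estimate of the integral (114)."""
    upper = gamma1 + 50.0  # the integrand is negligible beyond this point
    h = upper / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * h
        total += u ** p / (1.0 + math.exp(u - gamma1))
    return total * h

def U_prime_two_terms(p, gamma1):
    """gamma1^(p+1) [1/(p+1) + 2 p c_0 / gamma1^2], the j = 0 part of (122)."""
    c0 = math.pi ** 2 / 12.0
    return gamma1 ** (p + 1) * (1.0 / (p + 1) + 2.0 * p * c0 / gamma1 ** 2)

for p in (0.5, 1.5):
    exact = U_prime_numeric(p, 20.0)
    approx = U_prime_two_terms(p, 20.0)
    print(p, exact, approx, abs(exact - approx) / exact)
```

For $\gamma_1$ of this size the relative difference is already far smaller than
the correction term itself, which supports keeping only the first term of
the summation.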



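The total energy is given by eq. (81) with $a = 1$. In terms of the
general integral $U'(p, \gamma_1)$ in (114) it is given by

$$E = \frac{2\pi\tau}{h^3}\,(2mkT)^{3/2}\,kT\;U'\!\left(\tfrac{3}{2}, \gamma_1\right). \tag{130}$$

To the same order of approximation as that used in eq. (127) the
result is

$$E = \frac{4\pi\tau}{5h^3}\,(2mkT)^{3/2}\,kT\,\gamma_1^{5/2}\left[1 + \frac{5\pi^2}{8\gamma_1^2}\right]. \tag{131}$$

On dividing (131) by (127) and using the second approximation for
$\gamma_1$, i.e., (129), the energy may be expressed in the form

$$E = \frac{3}{10}\,\frac{Nh^2}{m}\left(\frac{3N}{4\pi\tau}\right)^{2/3}\left[1 + \frac{5\pi^2}{12}\left(\frac{2mkT}{h^2}\right)^2\left(\frac{4\pi\tau}{3N}\right)^{4/3}\right]. \tag{132}$$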

The interesting difference between this result and that for an ideal
gas on classical statistics is at once apparent. In the latter case we
have simply $E = \frac{3}{2}NkT$, and as $T$ approaches zero, $E$ also approaches
zero. For a strongly degenerate gas, on the other hand,
the energy approaches a non-vanishing value at the absolute zero.
This is known as the zero-point energy of the gas. Its value is

$$E_0 = \frac{3}{10}\,\frac{Nh^2}{m}\left(\frac{3N}{4\pi\tau}\right)^{2/3}. \tag{133}$$

The form of $E$ in its dependence on $T$ in (132) is also very different
from that for a weakly degenerate gas, given in eq. (87). Of course,
we cannot let $T \to 0$ in (87) because the approximation which enables
us to write (87) no longer holds for very small $T$.

The existence of the zero-point energy is of sufficient importance
for us to look upon it from another point of view. Of all the cells in the
phase space associated with the gas only one can correspond to the
energy value zero. On the Fermi-Dirac statistics there cannot be
more than one particle in this cell, if it is a particle without a spin. As
we have seen in the discussion at the end of Sec. 4, there may be two
electrons of opposite spin in each cell. For the moment, however, we
confine our attention to ordinary gas molecules. Now at the absolute
zero of temperature we should expect to find all the particles in the
lowest energy states, i.e., all cells filled up to a certain maximum
energy value. Consequently the total energy in this situation cannot
vanish. This crowding of the particles into the lower energy states at
$T = 0$ is another physical interpretation of the degeneracy of the gas.
At $T = 0$ the degeneracy is complete. As the temperature rises some
of the particles leave the lower energy states and are transferred to
higher states; the degeneracy decreases. When the distribution of
particles with respect to energy has become Maxwellian the degeneracy
has effectively disappeared and the gas is classical.

The ideas of the previous paragraph can be used to give an alternative
calculation of the zero-point energy. We recall from Sec. 4,
eq. (65) that the number of energy values for an aggregate of free
particles lying between $E_j$ and $E_j + dE_j$ is

$$n_j = \frac{2\pi\tau}{h^3}\,(2m)^{3/2} E_j^{1/2}\,dE_j.$$

If all energy levels from zero to some upper limit $E_{\max}$ are each occupied
by a single particle and the total number of particles is $N$, the
maximum energy value is given by

$$\frac{2\pi\tau}{h^3}\,(2m)^{3/2}\int_0^{E_{\max}} \sqrt{E}\,dE = N. \tag{134}$$

The integration yields

$$E_{\max} = \frac{h^2}{2m}\left(\frac{3N}{4\pi\tau}\right)^{2/3}. \tag{135}$$

The total energy of the whole degenerate assembly then becomes

$$E_0 = \frac{2\pi\tau}{h^3}\,(2m)^{3/2}\int_0^{E_{\max}} E^{3/2}\,dE = \frac{3}{10}\,\frac{Nh^2}{m}\left(\frac{3N}{4\pi\tau}\right)^{2/3}, \tag{136}$$

in precise agreement with the value given in (133) and computed from
(132).

We can also calculate the entropy for a degenerate Fermi-Dirac
gas. We have already noted that eq. (91) is general and does not
depend on the degree of degeneracy. Substitution of $\gamma_1$ from (129) and
$E$ from (132) into (91) yields

$$S = \frac{\pi^2 N m k^2 T}{h^2}\left(\frac{4\pi\tau}{3N}\right)^{2/3} + \text{terms involving higher powers of } T. \tag{137}$$

As $T \to 0$, it follows that $S \to 0$ and the entropy of a strongly degenerate
gas vanishes at absolute zero.

The remarks immediately above about the physical interpretation
of statistical degeneracy gain even more significance from a plot of
$N_j/n_j$ as a function of velocity magnitude. For the non-degenerate
gas (eq. 48) this, of course, is just the probability curve already shown
in Fig. 2-1 of Chapter II. The corresponding plot for a strongly degenerate
Fermi-Dirac gas is given in Fig. 8-2. For a very considerable
range of values of the velocity $v_j$ from zero up, $N_j/n_j$ remains practically
constant at the value unity, indicating that all the lowest energy
cells are occupied, each with its single allowed particle. As the higher
velocities are approached, however, $N_j/n_j$ drops more or less rapidly
to zero, the rate of fall depending on the actual value of $\gamma_1$ and therefore
largely on the temperature. In this part of the curve the classical
Maxwellian distribution is simulated to a certain extent. The value
of the velocity at which $N_j/n_j = \frac{1}{2}$, i.e., half the cells are occupied on
the average, is

$$v_0 = \sqrt{\frac{2kT\gamma_1}{m}}, \tag{138}$$

wherein we must use the value of $\gamma_1$ given in (128) or (129). The
result of the substitution from (128) is

$$v_0 = \frac{h}{m}\left(\frac{3N}{4\pi\tau}\right)^{1/3}. \tag{139}$$

FIG. 8-2.

The average energy per particle at $T = 0$ is, from (136),

$$\frac{E_0}{N} = \frac{3}{10}\,\frac{h^2}{m}\left(\frac{3N}{4\pi\tau}\right)^{2/3}. \tag{140}$$

This leads to a root-mean-square velocity at absolute zero of

$$v_m = \sqrt{\tfrac{3}{5}}\;v_0. \tag{141}$$
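
The shape of the occupation curve and the two characteristic velocities can
be explored with a few lines of code. The sketch below is illustrative
only: it uses CGS values of the constants, treats the particles as spinless
as in this section, and the sample density and mass as well as all function
names are assumptions made for the example.

```python
import math

H = 6.626e-27    # Planck constant, erg s (CGS)
K_B = 1.381e-16  # Boltzmann constant, erg/K

def gamma1_first(n, m, T):
    """First approximation (128) to gamma_1 for number density n = N/tau."""
    return H ** 2 / (2.0 * m * K_B * T) * (3.0 * n / (4.0 * math.pi)) ** (2.0 / 3.0)

def occupancy(v, m, T, gamma1):
    """N_j/n_j from (47) with a = +1 and E = m v^2 / 2."""
    x = -gamma1 + m * v * v / (2.0 * K_B * T)
    if x > 700.0:  # avoid overflow in exp; the occupancy is effectively zero
        return 0.0
    return 1.0 / (math.exp(x) + 1.0)

def v_half(n, m):
    """Velocity (139) at which the occupancy is one half."""
    return H / m * (3.0 * n / (4.0 * math.pi)) ** (1.0 / 3.0)

def v_rms_zero(n, m):
    """Root-mean-square velocity at T = 0, from (1/2) m v_m^2 = E_0/N in (140)."""
    e0 = 0.3 * H ** 2 / m * (3.0 * n / (4.0 * math.pi)) ** (2.0 / 3.0)
    return math.sqrt(2.0 * e0 / m)

n, m, T = 5.9e22, 9.11e-28, 300.0  # illustrative sample values (cm^-3, g, K)
g1 = gamma1_first(n, m, T)
v0 = v_half(n, m)
for factor in (0.5, 1.0, 1.5):
    print(factor, occupancy(factor * v0, m, T, g1))
print(v_rms_zero(n, m) / v0)
```

The occupancy stays near unity well below $v_0$, equals one half at $v_0$, and
falls to nearly zero above it, as in Fig. 8-2.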

8. FERMI-DIRAC STATISTICS OF A DEGENERATE ELECTRON GAS 

In the preceding sections we have not specialized the assembly of
free particles to which the quantum statistical distribution formulas
have been applied, save in so far as we have assumed them to be particles
without spin. In the present section we shall consider specifically
an electron gas in which the mutual interactions are neglected.
To make the problem more specific we shall assume that the number
of electrons per cm³ is of the same order of magnitude as the Loschmidt
number for an ideal gas. For example the gas might consist of the
(hypothetical) free electrons in a metal in which on the average at
room temperature there is one free electron per metallic atom. This
will indeed form an interesting and important application of the theory
of this section. The first thing we must note in the present discussion
is that $N_j$ in eq. (66) must now be multiplied by 2 in order to take
account of the electron spin. This factor will follow through all the
significant formulas. The easiest way to take care of its introduction is
to replace $\tau$ by $2\tau$ wherever it occurs.

The next question to settle is this: Is the assembly of electrons
degenerate or non-degenerate at room temperature? Going back to
eq. (77), we recall that if $e^{-\alpha} = e^{\gamma_1}$ ($b$ is equal to unity, of course) is
very small compared with unity, the degeneracy is extremely weak.
This is true for actual gases over a wide range of temperatures. For
the electron gas, however, such as we might expect to find in the metal
silver with one free electron per atom, substitution of $m = 9 \times 10^{-28}$
gram, $N/\tau = 5.9 \times 10^{22}$, $T = 300$ K, into

$$e^{-\alpha} = \frac{h^3 N}{2\tau\,(2\pi mkT)^{3/2}}$$

gives $e^{-\alpha}$ of the order of 2,330, which is far from being small. Evidently
for such an assembly even at very high temperatures $e^{-\alpha}$ will
still be much greater than unity, indicating that below temperatures
of the order of 10,000 K, an electron gas of the kind considered will
be strongly degenerate and must be treated by the appropriate form
of the Fermi-Dirac statistics. The formulas (129) and (132) are the
ones we must apply.
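
The size of the degeneracy parameter for silver is easy to reproduce. The
sketch below is illustrative: it uses present-day CGS values of $h$ and $k$,
so the result differs slightly from the figure quoted above, and the
function name and its `spin_weight` argument are assumptions made for the
example.

```python
import math

H = 6.626e-27    # Planck constant, erg s (CGS)
K_B = 1.381e-16  # Boltzmann constant, erg/K

def degeneracy_parameter(n, m, T, spin_weight=2):
    """h^3 N / (g tau (2 pi m k T)^(3/2)), with n = N/tau and g the spin weight."""
    return H ** 3 * n / (spin_weight * (2.0 * math.pi * m * K_B * T) ** 1.5)

silver_300 = degeneracy_parameter(n=5.9e22, m=9e-28, T=300.0)
silver_10000 = degeneracy_parameter(n=5.9e22, m=9e-28, T=1.0e4)
print(silver_300, silver_10000)
```

Since the parameter varies as $T^{-3/2}$, it remains above unity even at
10,000 K for this electron density.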

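It is interesting to calculate the specific heat of such a degenerate
electron gas. The specific heat per electron at constant volume is from
(132), with $\tau$ replaced by $2\tau$,

$$c_v = \frac{\partial}{\partial T}\left(\frac{E}{N}\right) = \frac{\pi^2 m k^2 T}{h^2}\left(\frac{8\pi\tau}{3N}\right)^{2/3}. \tag{142}$$

Per gram atom we have for the heat capacity

$$C_v = L c_v = \frac{\pi^2 m k R T}{h^2}\left(\frac{8\pi\tau}{3N}\right)^{2/3}, \tag{143}$$

where $L$ is the Avogadro number and $R$ the gas constant per gram atom.
If now we compute the right-hand side of (143) for $T = 300$ K for the
case of silver, already mentioned, we get approximately

$$C_v = 2.4 \times 10^{-2}\,R.$$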
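
The order of magnitude of the electronic heat capacity can be checked as
follows. The sketch is illustrative and rests on stated assumptions: it
takes (143) in the form obtained from (132) after replacing $\tau$ by $2\tau$,
expresses the result in units of $R$, and uses present-day CGS constants, so
the last digit need not match the value quoted in the text. The function
name is hypothetical.

```python
import math

H = 6.626e-27    # Planck constant, erg s (CGS)
K_B = 1.381e-16  # Boltzmann constant, erg/K

def electronic_cv_over_R(n, m, T):
    """C_v / R = (pi^2 m k T / h^2) (8 pi / (3 n))^(2/3), with n = N/tau."""
    return (math.pi ** 2 * m * K_B * T / H ** 2
            * (8.0 * math.pi / (3.0 * n)) ** (2.0 / 3.0))

cv_silver = electronic_cv_over_R(n=5.9e22, m=9e-28, T=300.0)
print(cv_silver, 1.5 / cv_silver)
```

The second printed number shows by what factor the classical value $3R/2$
exceeds the degenerate result at room temperature.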

The interesting thing about this result is that the classical equipartition
principle when applied to an electron gas gives $C_v = 3R/2$,
independently of the temperature, which is a much larger value than
that above at room temperature. If we are considering an electron
gas in a metal with approximately one free electron per metal atom,
the contribution of the free electrons to the specific heat of the metal
at room temperature is therefore relatively very small. Practically
all the observed specific heat is due to the atoms themselves. This
solves a fundamental difficulty that has always plagued the classical
electron theory of metals. This difficulty was the one of allowing
enough free electrons to account for the observed electrical properties
of metals and at the same time keeping the specific heat down to the
experimentally observed value which at ordinary temperatures is given
very closely by $C_v = 3R$ (law of Dulong and Petit. Cf. Chapter IX,
Sec. 3). In the classical theory the two requirements seemed quite
irreconcilable. Quantum statistics appears to solve the problem very
nicely.

9. BOSE-EINSTEIN STATISTICS OF A STRONGLY DEGENERATE GAS 

We have discussed weak degeneracy from the standpoint of both
types of quantum statistics and strong degeneracy from the standpoint
of the Fermi-Dirac statistics. What constitutes strong degeneracy in
the Bose-Einstein statistics? The fundamental distribution formula
for the latter type is

$$N_j = \frac{2\pi\tau}{h^3}\,\frac{(2m)^{3/2}\sqrt{E_j}\,dE_j}{e^{-\gamma_1}e^{E_j/kT} - 1}. \tag{144}$$

For weak degeneracy $\gamma_1$ is negative so that $e^{-\gamma_1} \gg 1$ and the first
term in the denominator outweighs the unity even for small values of
$E_j$. For strong degeneracy in the Fermi-Dirac statistics $e^{-\gamma_1} \ll 1$.
Evidently such a situation is meaningless in the Bose-Einstein case as
it would lead to negative values of $N_j$ for a considerable range of
values of $E_j$. Consequently we are led to the conclusion that the
strongest degeneracy, i.e., the greatest deviation from classical statistics,
possible for a Bose-Einstein gas corresponds to $e^{-\gamma_1} = 1$ or $\gamma_1 = 0$.
This has some interesting consequences. In the first place, the total
number of particles in such a strongly degenerate Bose-Einstein gas is,
from eq. (76)

$$N = \frac{\tau}{h^3}\,(2\pi mkT)^{3/2}\sum_{n=1}^\infty \frac{1}{n^{3/2}} = 2.61\,\frac{\tau}{h^3}\,(2\pi mkT)^{3/2}, \tag{145}$$

where the factor 2.61 is an approximation to the summation indicated.
For a given volume of such a gas with a given number of particles,
eq. (145) prescribes the temperature at which complete degeneracy is
attained. A glance at (76) shows that if the gas were slightly less
degenerate, i.e., $e^{-\alpha} < 1$ slightly, $N$ would become smaller. By no
change can it become larger than the value given in (145). Consequently
we can make the statement that for a degenerate Bose-Einstein
gas the number of particles per unit volume has an upper
limit which at temperature $T$ is

$$\frac{N}{\tau} = \frac{2.61}{h^3}\,(2\pi mkT)^{3/2}. \tag{146}$$

This corresponds to a maximum equilibrium density for any particular
temperature, a situation very different from that holding for an ideal
gas. Any larger density than that given by (146) will necessarily correspond
to a non-equilibrium state.

We can evaluate the energy for the case of complete degeneracy by
reverting to eq. (81) and computing $E$ for $\gamma_1 = 0$. The result is given
by (82) with $a = -1$ and the choice of the lower sign throughout.
Thus

$$E = \frac{3\tau}{2h^3}\,(2\pi mkT)^{3/2}\,kT\sum_{n=1}^\infty \frac{1}{n^{5/2}}. \tag{147}$$

Since $\sum_{n=1}^\infty 1/n^{5/2} = 1.34$ approximately, the expression for $E$ takes the
approximate form

$$E = 2.01\,\frac{\tau}{h^3}\,(2\pi mkT)^{3/2}\,kT. \tag{147'}$$

The equation of state in the completely degenerate condition can now
be found at once from eq. (94). It is

$$P = \frac{2}{3}\,\frac{E}{\tau} = \frac{1.34}{h^3}\,(2\pi mkT)^{3/2}\,kT. \tag{148}$$

In other words, the pressure is dependent on the temperature alone.
This is, however, the situation one encounters in the pressure of saturated
vapor and suggests that the strongly degenerate Bose-Einstein
gas behaves like a gas below its critical point, so that when the temperature
is reduced to a low enough value a certain kind of "condensation"
takes place, removing from the higher energy states a certain number
of particles and transferring them to the lowest energy state where
they make no contribution to the pressure of the gas, a possible effect
in the Bose-Einstein statistics that is of course not present in the
Fermi-Dirac statistics. This "condensation" phenomenon has been
studied in detail by F. London¹³ and possible applications discussed,
particularly to the interesting properties of liquid helium near its
transition point at 2.19 K.

¹³ F. London, Phys. Rev. 54, 947 (1938). 

Equation (147') cannot be used to derive the expression for the 
specific heat at constant volume. For this purpose we must use the 
closed form expressions (72) and (81). From (81) we have 



where the integral certainly remains finite as $\gamma_1 \to 0$. The factor 
$(\partial\gamma_1/\partial T)_\tau$ can be evaluated by differentiating eq. (72) (with a = 1) 
with respect to T. We obtain after conducting a partial integration 



However, as $\gamma_1 \to 0$ the integral in the denominator approaches 
infinity and hence in the completely degenerate state the specific heat 
is given by 

$$\left(\frac{\partial E}{\partial T}\right)_\tau = \frac{5}{2}\,\frac{E}{T}. \tag{151}$$

From (146) combined with (147') we then obtain the approximate 
result 

$$C_v = 1.9R. \tag{152}$$
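The numerical factors quoted for the completely degenerate gas can be checked with a short Python sketch. It assumes the standard closed forms, namely N proportional to $\sum n^{-3/2}$ and E proportional to $\tfrac{3}{2}kT\sum n^{-5/2}$, both multiplied by the same factor $\tau(2\pi mkT)^{3/2}/h^3$, so that $C_v/R = \tfrac{15}{4}\sum n^{-5/2}/\sum n^{-3/2}$. The helper `zeta` and its tail correction are illustrative choices and are not taken from the text.

```python
def zeta(s, terms=1000):
    """Approximate the sum of n**(-s) over n = 1, 2, ... for s > 1.

    The first `terms - 1` terms are added directly; the remainder is
    estimated with the leading Euler-Maclaurin corrections.
    """
    n = terms
    head = sum(k ** (-s) for k in range(1, n))
    tail = n ** (1.0 - s) / (s - 1.0) + 0.5 * n ** (-s) + s * n ** (-s - 1.0) / 12.0
    return head + tail

def cv_over_r_degenerate():
    """C_v / R for the completely degenerate Bose-Einstein gas.

    With E varying as T**(5/2) at fixed volume, dE/dT = (5/2) E / T, hence
    C_v / (N k) = (5/2) * (3/2) * zeta(5/2) / zeta(3/2).
    """
    return 3.75 * zeta(2.5) / zeta(1.5)

if __name__ == "__main__":
    print(f"sum n^(-3/2) = {zeta(1.5):.4f}")
    print(f"sum n^(-5/2) = {zeta(2.5):.4f}")
    print(f"C_v / R      = {cv_over_r_degenerate():.3f}")
```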

10. STATISTICS OF A PHOTON GAS. RADIATION LAW 

Perhaps the most interesting application of the Bose-Einstein 
statistics of a strongly degenerate gas lies in the field of radiation. Let 
us consider a radiation field in physical volume τ to be equivalent to a 
collection of light particles or photons corresponding to a wide range of 
frequencies. With a photon of frequency ν is associated energy hν and 
momentum hν/c, where c is the velocity of light in free space. We 
desire the distribution formula for the photons with respect to frequency, 
i.e., for a radiation field in the equilibrium state corresponding 
to temperature T, the average number of photons with frequency 
lying in the range from ν to ν + dν. We shall assume that the photon 
gas is effectively a collection of free particles obeying the Bose-Einstein 
statistics and in a state of strong degeneracy at all temperatures. 

Obviously we cannot apply directly the distribution formula (144) 
with $\gamma_1 = 0$, since there is no meaning attached to the mass of a photon. 
However, we can rewrite (144) in terms of the momentum $p_j$ in place 
of energy in the numerator and so get rid of m. For a free particle 
we have at once 

$$E_j = \frac{p_j^2}{2m}.$$

Consequently the distribution with respect to momentum (with 
$\gamma_1 = 0$)¹⁴ becomes 

$$N_j = \frac{4\pi\tau}{h^3}\,\frac{p_j^2\,dp_j}{e^{E_j/kT} - 1}. \tag{153}$$

If now we set $p_j = h\nu_j/c$ and $E_j = h\nu_j$, the above expression becomes 

$$N_j = \frac{4\pi\tau}{c^3}\,\frac{\nu_j^2\,d\nu_j}{e^{h\nu_j/kT} - 1}. \tag{154}$$

Let us make the further assumption that the photons, like electrons, 
possess a spin. This effectively doubles the number $N_j$ (cf. the corresponding 
situation in the case of electrons which we take account of 
by multiplying τ by 2 wherever it occurs). When we leave off 
subscripts and denote by dN the number of photons in the frequency 
range from ν to ν + dν, we have for the average number of photons per 
unit physical volume in the range mentioned 

$$dN = \frac{8\pi}{c^3}\,\frac{\nu^2\,d\nu}{e^{h\nu/kT} - 1}. \tag{155}$$
Since each of these photons has energy hν, the average energy density 
in the radiation field for frequency ν becomes 

$$dE_\nu = \frac{8\pi h\nu^3}{c^3}\,\frac{d\nu}{e^{h\nu/kT} - 1}. \tag{156}$$



This is the radiation law of Planck which has been found to be in sub- 
stantial agreement with experimental observations. Its derivation 
confirms that a photon gas may be considered to behave like a strongly 
degenerate Bose-Einstein gas. 

For very low frequencies or high temperatures (156) can be very 
accurately approximated by 

$$dE_\nu = \frac{8\pi\nu^2 kT}{c^3}\,d\nu \tag{157}$$

or in terms of wavelength interval (λ = c/ν) 

$$dE_\lambda = \frac{8\pi kT}{\lambda^4}\,d\lambda. \tag{158}$$

Equations (157-158) are equivalent to the Rayleigh-Jeans law, which 
describes the distribution of energy in a radiation field rather accurately 
for long wavelengths. On the other hand for high frequencies or low 
temperatures (156) takes the approximate form 

$$dE_\nu = \frac{8\pi h\nu^3}{c^3}\,e^{-h\nu/kT}\,d\nu, \tag{159}$$

the well-known distribution formula of Wien. 

¹⁴ It should be stated that in the case of a photon gas $\gamma_1 = 0$ follows at once 
from the simple fact that the number of particles is not fixed and hence the condition 
$\sum \delta N_j = 0$ has no longer a meaning. The parameter $\gamma_1$ then really does not enter 
the problem. 
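The way in which the Planck law passes into the Rayleigh-Jeans and Wien forms can be seen numerically. The Python sketch below writes the three energy densities per unit frequency interval in their standard forms and compares them at small and large values of x = hν/kT. The function names and the SI values of the constants are assumptions made for this illustration.

```python
import math

H = 6.62607015e-34   # Planck constant, J s
K = 1.380649e-23     # Boltzmann constant, J/K
C = 2.99792458e8     # velocity of light, m/s

def planck(nu, temperature):
    """Planck law: 8 pi h nu^3 / (c^3 (exp(h nu / kT) - 1))."""
    x = H * nu / (K * temperature)
    return 8.0 * math.pi * H * nu ** 3 / (C ** 3 * math.expm1(x))

def rayleigh_jeans(nu, temperature):
    """Low-frequency limit: 8 pi nu^2 k T / c^3."""
    return 8.0 * math.pi * nu ** 2 * K * temperature / C ** 3

def wien(nu, temperature):
    """High-frequency limit: 8 pi h nu^3 exp(-h nu / kT) / c^3."""
    x = H * nu / (K * temperature)
    return 8.0 * math.pi * H * nu ** 3 * math.exp(-x) / C ** 3

if __name__ == "__main__":
    T = 300.0
    for x in (0.01, 1.0, 10.0):              # x = h nu / kT
        nu = x * K * T / H
        p = planck(nu, T)
        print(f"x = {x:5.2f}  RJ/Planck = {rayleigh_jeans(nu, T) / p:10.4f}  "
              f"Wien/Planck = {wien(nu, T) / p:.4f}")
```

The ratio of the Rayleigh-Jeans to the Planck value is $(e^x - 1)/x$, which grows without limit at high frequency, while the ratio of the Wien to the Planck value is $1 - e^{-x}$.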

From the historical point of view it is interesting to recall that 
Bose's original deduction of Planck's radiation law was the starting 
point of quantum statistics. This is reason enough for reviewing 
Bose's method¹⁵ here, even though it is, of course, somewhat off the 
track of the systematic development of this chapter. 

In considering the distribution of objects in boxes or cells let us for 
the moment focus our attention on the cells rather than the objects. 
Thus of a total of Q cells we let $Q_0$ be the number which contain zero 
objects; $Q_1$, the number containing one object; $Q_j$, the number 
with j objects, etc., with 

$$\sum_{j=0}^{N} Q_j = Q, \tag{160}$$

where N is the maximum number of objects per cell. The number of 
ways of selecting the Q cells so that $Q_0$ contain no objects, $Q_1$ one 
object, etc., is 

$$\Pi = \frac{Q!}{Q_0!\,Q_1!\cdots Q_N!}. \tag{161}$$

We now proceed to make Π a maximum subject to the conditions that 

$$\sum_{j=0}^{N} E_j Q_j = E = \text{total energy} = \text{constant},$$

$$\sum_{j=0}^{N} Q_j = Q = \text{total number of cells} = \text{constant}. \tag{162}$$

¹⁵ Z. Physik, 26, 178 (1924); 27, 384 (1924). 

Here $E_j$ must now mean the energy associated with the cell which 
contains j particles. Proceeding as usual in the case of a canonical 
distribution (Sec. 2, Chapter IV) and setting the distribution parameter equal to kT, 
we find 

$$Q_j = Q\,\frac{e^{-E_j/kT}}{\sum_{i=0}^{N} e^{-E_i/kT}}. \tag{163}$$

The average energy per cell appears as 

$$\bar{E} = \frac{1}{Q}\sum_{j=0}^{N} E_j Q_j = \frac{\sum_{j=0}^{N} E_j\,e^{-E_j/kT}}{\sum_{j=0}^{N} e^{-E_j/kT}}. \tag{164}$$

Now if the particles being distributed in the cells are photons we must 
assume that 

$$E_j = jh\nu. \tag{165}$$

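Hence 

$$\bar{E} = \frac{\sum_{j=0}^{N} jh\nu\,e^{-jh\nu/kT}}{\sum_{j=0}^{N} e^{-jh\nu/kT}}. \tag{166}$$

But 

$$\sum_{j=0}^{\infty} e^{-jh\nu/kT} = \frac{1}{1 - e^{-h\nu/kT}}, \tag{167}$$

and if the maximum number of photons per cell is large, as we have 
reason to suppose will be the case, we can replace the finite sum in the 
denominator of (166) by the infinite sum in (167). If we differentiate 
(167) with respect to T, the result is 

$$\sum_{j=0}^{\infty} \frac{jh\nu}{kT^2}\,e^{-jh\nu/kT} = \frac{h\nu}{kT^2}\,\frac{e^{-h\nu/kT}}{\left(1 - e^{-h\nu/kT}\right)^2},$$

from which (166) can be written in the form 

$$\bar{E} = \frac{h\nu}{e^{h\nu/kT} - 1}. \tag{168}$$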



If the photons are considered to form a gas of free particles in a volume 
τ with momentum values hν/c, the average number of them in the 
energy interval hν to hν + d(hν) will be, from Sec. 4, 

$$dN = \frac{8\pi\tau}{c^3}\,\nu^2\,d\nu. \tag{169}$$

We have here assigned two photons to each energy value. Since the 
average energy per photon in this range is given by (168) it follows that 
the average energy density of the radiation in the frequency range from 
ν to ν + dν is given by (156), i.e., the Planck law again. 
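The replacement of the finite sum over the number of photons per cell by the infinite sum can be examined numerically. The following Python sketch, in which the function names are hypothetical and energies are measured in units of hν, computes the average energy per cell for a finite maximum occupation and compares it with the closed form $1/(e^x - 1)$, where x = hν/kT.

```python
import math

def mean_cell_energy(x, max_quanta):
    """Average energy per cell in units of h*nu.

    x = h*nu/(k*T); a cell holding j photons has energy j*h*nu and
    Boltzmann weight exp(-j*x), with j = 0, 1, ..., max_quanta.
    """
    weights = [math.exp(-j * x) for j in range(max_quanta + 1)]
    return sum(j * w for j, w in enumerate(weights)) / sum(weights)

def planck_mean_energy(x):
    """Closed form 1/(exp(x) - 1), again in units of h*nu."""
    return 1.0 / math.expm1(x)

if __name__ == "__main__":
    x = 0.5
    for n_max in (1, 5, 20, 100):
        print(n_max, mean_cell_energy(x, n_max), planck_mean_energy(x))
```

For a small maximum occupation the finite sum falls well below the closed form; the two agree once the maximum number of photons per cell is large compared with kT/hν.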

11. APPLICATION OF QUANTUM STATISTICS TO ATOMIC STRUCTURE 

One of the most striking applications of quantum statistics is the 
determination of the average distribution of charge in an atom. It will 
be recalled that in the nuclear atom model an atom is assumed to con- 
sist of a nucleus with a positive charge and most of the mass of the 
atom surrounded by an aggregate of electrons equal in number (for 
the neutral atom) to the charge on the nucleus. It is the problem of 
atomic structure to determine the possible energy values of such a 
system as well as the average distribution of charge considered as 
effectively continuous. The solution also provides the potential field 
in the neighborhood of the atom, which is important in many applica- 
tions of atomic structure. The complete formal solution is indeed 
possible only occasionally, i.e., when there is but a single electron, e.g., 
hydrogen, ionized helium. Various approximation methods have been 
devised to solve the problem for polyelectronic atoms. One of the 
most interesting of these was proposed by Fermi,¹⁶ and independently 
by Thomas.¹⁷ Their assumption is, in effect, that the electrons in a 
polyelectronic atom behave as a degenerate gas obeying the Fermi-Dirac 
statistics. If we assume that the electrons in an atom occupy a 
sphere of radius approximately $10^{-8}$ cm, the density of the electron 
distribution will vary from $10^{24}$ to $10^{26}$ per cm³. From the considerations 
of Sec. 8 of this chapter, we see that such a gas will indeed be 
degenerate for all ordinary temperatures. 

¹⁶ Z. Physik, 48, 73 (1928); 49, 550 (1928). 
¹⁷ Proc. Cambridge Phil. Soc. 23, 542 (1927). 

The problem now in question differs from that previously discussed 
in Secs. 7 and 8 in one important respect, namely that whereas there 
we considered the electrons in the gas as being free and therefore 
possessing only kinetic energy or at most existing in a field of constant 
potential, we must now think of them as moving in a variable field and 
for this reason as possessing, in addition to kinetic energy, potential 
energy varying from point to point. It will be seen, however, that the 
only effect of this is to make the velocity distribution a function of 
position in space. In fact we can use the ordinary Fermi distribution 
law (eq. 47) with a = + 1 merely by incorporating with the energy $E_j$ 
the added potential energy eV, where V is the potential of the field 
and a function of x, y, z. Thus we can now write for the number of 
electrons having total momentum lying in the interval (p, p + dp) at 
the point (x, y, z) 

p 2 dp 



This follows readily from eq. (66) with the appropriate representation 
of $E_j$ in terms of momentum. Strictly speaking it is necessary to 
generalize the derivation of $n_j$ in Sec. 4 to include the constant potential 
energy term. When this is done, eq. (170) follows. The total 
number of electrons per unit volume at (x, y, z) then becomes (using 
the first approximation for N in eq. (127) with τ replaced by 2τ to take 
care of the spin and $\gamma_1$ replaced by $\gamma_1$ + eV/kT) 

N 8 ^ / eV\ 



It is now found convenient to set 

v = 7 + :iL-ii f (172) 

where v thus appears as the potential of the atomic field expressed to an 
arbitrary additive constant $kT\gamma_1/e$. Equation (171) then becomes 

$$n = \frac{8}{3\sqrt{\pi}\,h^3}\,(2\pi me)^{3/2}\,v^{3/2}. \tag{173}$$



The equivalent charge density, which we shall designate by ρ, is equal 
to −ne. If we treat this as a continuous static distribution, it must 
satisfy Poisson's equation, viz., 

$$\nabla^2 v = -4\pi\rho = \frac{32\sqrt{\pi}}{3h^3}\,(2\pi me)^{3/2}\,e\,v^{3/2}. \tag{174}$$

As a first approximation we shall take the distribution to be spherically 
symmetrical, so that v is a function of r only, where r is the distance 
from the nucleus. Then 

$$\frac{d^2 v}{dr^2} + \frac{2}{r}\,\frac{dv}{dr} = C v^{3/2}, \tag{175}$$

with 

$$C = \frac{32\sqrt{\pi}}{3h^3}\,(2\pi m)^{3/2}\,e^{5/2}.$$

The solution of (175) must be found subject to the boundary conditions 

$$\lim_{r \to 0} rv = Ze; \qquad \int n\,d\tau = Z, \tag{176}$$

where Z is the number of positive charges on the nucleus, i.e., the 
atomic number, and the integration is to be extended over all space. 
The first of the above conditions expresses the fact that when an electron 
is very close to the nucleus its potential is practically that due to 
the nucleus alone. The second condition expresses the fact that the 
total charge in the distribution corresponds to Z electrons, for the 
neutral atom. This will obviously not hold for an ionized atom, but 
the necessary alteration is fairly obvious and introduces no fundamental 
change in principle. 

Equation (175) may be considerably simplified by the substitutions 
$x = r/[(Ze)^{-1/3}C^{-2/3}]$ and $\psi = v/[(Ze)^{4/3}C^{2/3}]$. The equation then becomes 

$$\frac{d^2\psi}{dx^2} + \frac{2}{x}\,\frac{d\psi}{dx} = \psi^{3/2}, \tag{177}$$

subject to the boundary conditions 

$$\lim_{x \to 0} x\psi = 1, \qquad \int_0^\infty \psi^{3/2} x^2\,dx = 1. \tag{178}$$

Finally we set $\phi(x) = x\psi$ and reduce (177) to an equation without the 
first order term, viz., 

$$\frac{d^2\phi}{dx^2} = \frac{\phi^{3/2}}{\sqrt{x}}. \tag{179}$$

The boundary conditions (178) take the final form 

$$\phi(0) = 1, \qquad \int_0^\infty \phi^{3/2}\sqrt{x}\,dx = 1. \tag{180}$$


Eq. (179) has been solved numerically by Fermi and the solution is 
tabulated in the accompanying table. 

TABLE OF VALUES OF φ(x) IN (179) 

    x     φ(x)        x     φ(x)        x     φ(x)
   0.0   1.000       1.5   0.315      10.0   0.024
   0.1   0.882       2.0   0.244      11.0   0.020
   0.2   0.793       2.5   0.194      12.0   0.017
   0.3   0.721       3.0   0.157      13.0   0.014
   0.4   0.660       3.5   0.130      14.0   0.012
   0.5   0.607       4.0   0.108      15.0   0.011
   0.6   0.562       5.0   0.079      16.0   0.009
   0.7   0.521       6.0   0.059      17.0   0.008
   0.8   0.485       7.0   0.046      18.0   0.007
   0.9   0.453       8.0   0.037      19.0   0.006
   1.0   0.425       9.0   0.029      20.0   0.005

We can use this solution to obtain the potential v and the charge 
density ρ. Thus 



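$$v = \frac{Ze}{r}\,\phi(x) \tag{181}$$

and 

$$\rho = -\frac{C}{4\pi}\left(\frac{Ze}{r}\right)^{3/2}\phi^{3/2}(x). \tag{182}$$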



This solves the problem of the statistical charge distribution in the 
atom. We should expect the resulting potential field to be a fair 
approximation to the actual field, and indeed one which improves as 
the number of electrons increases. As a matter of fact the approxima- 
tion is found to be a very good one for all atoms, except in the outer 
regions of the electron distribution. It is a very useful approximation 
in various applications of atomic structure, particularly those encoun- 
tered in the structure of metallic crystal lattices. 
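The numerical solution of the equation $d^2\phi/dx^2 = \phi^{3/2}/\sqrt{x}$ with φ(0) = 1 can be reproduced with a few lines of Python. The sketch below is not Fermi's procedure: it assumes the initial slope φ'(0) = −1.5880710226 of the neutral-atom solution as a known constant, and it removes the singularity at the origin by the substitution t = √x before applying an ordinary fourth-order Runge-Kutta step. The function name and step count are illustrative choices.

```python
B = -1.5880710226   # assumed initial slope phi'(0) of the neutral-atom solution

def thomas_fermi(x_end, steps=2000):
    """Integrate phi'' = phi**1.5 / sqrt(x) from x = 0 to x_end.

    With t = sqrt(x) and w = d(phi)/dx the equation becomes the regular system
        d(phi)/dt = 2 t w,     dw/dt = 2 phi**1.5,
    so a fixed-step RK4 scheme can start directly at t = 0.
    """
    def rhs(t, phi, w):
        return 2.0 * t * w, 2.0 * max(phi, 0.0) ** 1.5

    t, phi, w = 0.0, 1.0, B
    h = x_end ** 0.5 / steps
    for _ in range(steps):
        k1 = rhs(t, phi, w)
        k2 = rhs(t + h / 2, phi + h / 2 * k1[0], w + h / 2 * k1[1])
        k3 = rhs(t + h / 2, phi + h / 2 * k2[0], w + h / 2 * k2[1])
        k4 = rhs(t + h, phi + h * k3[0], w + h * k3[1])
        phi += h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        w += h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        t += h
    return phi

if __name__ == "__main__":
    for x in (0.1, 0.5, 1.0, 2.0):
        print(f"x = {x:4.1f}   phi = {thomas_fermi(x):.4f}")
```

Forward integration of this kind is sensitive to the assumed slope at large x, so the sketch is meant only for the inner part of the table.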

PROBLEMS 

1. In a cube of gold, 1 cm on a side, assume that the number of free electrons is 
equal to the number of atoms. Calculate a few energy values for the electrons for 
small values of $n_1$, $n_2$, $n_3$ in eq. (59) and do the same for some larger values of these 
integers, e.g., of the order of magnitude of $10^2$. Compare the spacing of successive 
energy levels in both cases. 

2. If the free electrons of the cube of gold in Problem 1 form a degenerate gas at 
absolute zero with all energy levels filled, calculate the maximum kinetic energy. 
Also calculate the total kinetic energy of the electrons. 

3. Calculate the total energy of one mole of oxygen at its critical temperature, 
treating it as a Bose-Einstein gas. 

4. At what temperature (order of magnitude) does the electron gas in metallic 
gold cease to be degenerate? 

5. Prove that the number of combinations of $n_j$ objects $N_j$ at a time with all 
possible repetitions allowed is equal to the number of combinations of $n_j + N_j - 1$ 
objects $N_j$ at a time without repetitions. Hence derive the Bose-Einstein statistical 
probability (41"). 

6. Show that the isothermal compressibility of an ideal gas is equal to the recip- 
rocal of the pressure. Find the expressions for the isothermal compressibility of a 
weakly degenerate Bose-Einstein and a weakly degenerate Fermi-Dirac gas. 

7. Use eq. (94) to find the approximate equation of state of a strongly degenerate 
Fermi-Dirac gas. Compare the isothermal compressibility with that of a weakly 
degenerate Fermi-Dirac gas. 

8. Derive the expression for the entropy of a strongly degenerate Fermi-Dirac 
gas out to terms of the order $T^2$. 

9. At what temperature does the atomic heat capacity of the strongly degenerate 
electron gas in metallic silver become equal to the value predicted by the classical 
equipartition principle? 

10. Compute in tabular form the Fermi-Thomas statistical charge distribution 
in the neutral sodium atom in its normal state. 



CHAPTER IX 
SPECIFIC HEATS OF GASES AND SOLIDS 

1. SPECIFIC HEATS OF AN IDEAL MONATOMIC GAS 

The two specific heats at constant volume and constant pressure 
constitute important characteristics of a gas. In this chapter we dis- 
cuss in systematic fashion their theoretical evaluation for certain types 
of aggregates. They have already been defined in Sec. 4, Chapter III 
and discussed in preliminary fashion in Sec. 2, Chapter V. If the 
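average total energy of the aggregate is E and its total mass is M, the 
specific heat at constant volume is 

$$c_V = \frac{1}{M}\left(\frac{\partial E}{\partial T}\right)_V, \tag{1}$$

and that at constant pressure is 

$$c_p = \frac{1}{M}\left(\frac{\partial E}{\partial T}\right)_p + \frac{p}{M}\left(\frac{\partial V}{\partial T}\right)_p. \tag{2}$$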

In this section we shall be concerned with an ideal monatomic gas 
consisting of N free mass particles of mass m and each possessing three 
degrees of freedom. We shall follow the Darwin-Fowler method in our 
discussion and use the energy expressions obtained in Chapter VII 
and Chapter VIII. For a classical ideal gas, we have 

$$E = N\zeta\,\frac{\partial}{\partial\zeta}\log Z(\zeta), \tag{3}$$

where Z(ζ) is the partition function 

$$Z(\zeta) = \sum_j g_j\,\zeta^{E_j} \tag{3'}$$

and $\zeta = e^{-1/kT}$. The $E_j$ in the sum are the possible energy values of 
any particle in the aggregate. If we transform from ζ to T, (3) becomes 



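$$E = NkT^2\,\frac{\partial}{\partial T}\log Z(T), \tag{4}$$

and the specific heat at constant volume is 

$$c_V = \frac{k}{m}\,\frac{\partial}{\partial T}\left(T^2\,\frac{\partial}{\partial T}\log Z(T)\right). \tag{5}$$
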
From eq. (100) of Chapter VIII we can write the corresponding expression 
for an ideal gas obeying the Fermi-Dirac statistics as¹ 



The corresponding specific heat at constant volume for a Bose-Einstein 
ideal gas is (from eq. 104 of Chapter VIII) 



We have already used (6) or its equivalent in evaluating the specific 
heat of a degenerate electron gas (Sec. 8, Chapter VIII, eq. 142). We 
have also seen that for an actual gas composed of neutral molecules 
for which the Bose-Einstein statistics is indicated, the modification 
due to the use of the former in place of classical statistics is so small 
under normal conditions as to be negligible. Consequently in what 
follows we shall use the classical statistical formula (5) for specific 
heat calculations. 

The problem reduces essentially to the determination of the partition 
function Z(T) given in the classical statistics by (3'). For an 
ideal gas this is given by eq. (77) of Chapter VII. We write it again 
for convenience,² expressing it as a function of T. 

$$Z(T) = \frac{V}{h^3}\,(2\pi mkT)^{3/2}. \tag{8}$$

The application of (5) yields at once the familiar result 

$$c_V = \frac{3}{2}\,\frac{k}{m}, \tag{9}$$

and the associated value of $c_p$ also follows directly. 
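The passage from the partition function to the specific heat can be carried out numerically as well as analytically. The Python sketch below assumes the classical relation in the form $c_V = (k/m)\,\partial/\partial T\,(T^2\,\partial \log Z/\partial T)$ and evaluates the derivatives by central differences. The function names are hypothetical, and all constant factors of the translational partition function are lumped into a single prefactor, since they drop out on differentiation.

```python
import math

def cv_in_units_of_k_over_m(log_z, temperature, rel_step=1e-4):
    """Evaluate d/dT ( T^2 d(log Z)/dT ) by nested central differences."""
    h = rel_step * temperature

    def t2_dlogz(t):
        return t * t * (log_z(t + h) - log_z(t - h)) / (2.0 * h)

    return (t2_dlogz(temperature + h) - t2_dlogz(temperature - h)) / (2.0 * h)

def log_z_translation(temperature, prefactor=1.0):
    """log of Z(T) = prefactor * T**1.5 for the ideal monatomic gas."""
    return math.log(prefactor) + 1.5 * math.log(temperature)

if __name__ == "__main__":
    print(cv_in_units_of_k_over_m(log_z_translation, 300.0))
```

The same helper can be applied to any other partition function supplied as a function of T, for example a rotational or vibrational one.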

¹ It should perhaps be stressed that in (6) and (7) only the differentiation of the 
bracket is to be conducted at constant volume. 

² In this chapter we use V for volume. 

2. THE IDEAL DIATOMIC GAS 

The ideal diatomic gas consists of an aggregate of freely moving 
molecules, each of which is composed of two similar atoms. The 
determination of the allowed energy values of a diatomic molecule is a 
problem in quantum mechanics which we shall not work out here. We 
merely remind the reader that the energy of such a molecule is made up 
of three parts, viz., (a) energy of translation of the center of mass, (b) 
energy of rotation, and (c) energy of vibration. The possible energy 
values for translation have been worked out in Sec. 4, Chapter VIII 
and are 



If we simplify by assuming that the molecules are confined to a cube of 
side l and volume $V = l^3$, the above expression becomes 

$$E_t = \frac{h^2}{8ml^2}\,(n_1^2 + n_2^2 + n_3^2). \tag{10}$$

Here $n_1$, $n_2$, and $n_3$ can take all integral values. 

The possible energy values for rotation for a two-dimensional 
rotator are³ 

$$E_r = \frac{j(j+1)h^2}{8\pi^2 I}, \tag{11}$$

where j takes on positive integral values (including zero) and I is the moment of inertia 
of the molecule considered as a dumbbell. I is taken about an axis 
perpendicular to the line joining the atoms and passing through the 
center of mass. If indeed the molecule consists of two different atoms 
of masses $m_1$ and $m_2$ respectively separated by an equilibrium distance 
a, we have 

$$I = \frac{m_1 m_2}{m_1 + m_2}\,a^2. \tag{12}$$

In the special case being considered in this section, $m_1 = m_2 = m$ and 

$$I = \frac{ma^2}{2}. \tag{13}$$

The possible energy values for vibration are taken to be those for a 
simple harmonic oscillator of frequency ν, namely (cf. eq. 53, Chapter 
VII), 

$$E_v = \left(j + \tfrac{1}{2}\right)h\nu, \tag{14}$$

where j takes on positive integral values including zero. 

³ Cf., for example, Lindsay and Margenau, "Foundations of Physics," p. 435. 

The total energy of the molecule will then be expressible as the sum 
of the energies $E_t$, $E_r$, and $E_v$. The writing of the partition function 
(3') might appear to offer some difficulty because of the presence of 
the weight factors $g_j$. These can be omitted explicitly, however, if we 
incorporate them into the energy levels themselves by agreeing to 
repeat in the summation terms which have weight greater than unity. 
We then write 

$$Z(T) = \sum e^{-(E_t + E_r + E_v)/kT}, \tag{15}$$

the sum being taken over all values for all three energy types. Because 
of the exponential form, simplification is possible. Thus 

$$Z(T) = \sum e^{-E_t/kT}\cdot\sum e^{-E_r/kT}\cdot\sum e^{-E_v/kT}, \tag{16}$$

or 

$$Z(T) = Z_t(T)\cdot Z_r(T)\cdot Z_v(T). \tag{17}$$

We can then determine the partition functions for translation, rotation, 
and vibration separately and find the total function by multiplication. 

Let us first investigate the partition function $Z_r(T)$. From quantum 
mechanics it develops that each rotational energy state j has a 
weight 2j + 1 associated with it. Hence 

$$Z_r(T) = \sum_{j=0}^{\infty} (2j+1)\,e^{-j(j+1)h^2/8\pi^2 IkT}. \tag{18}$$

It turns out⁴ that (18) with j allowed to take all positive integral values 
is the partition function for rotation for diatomic molecules composed 
of different atoms. If, however, the atoms are identical, i.e., the molecule 
is that of an element, not all values of j are allowed in the summation; 
rather only the odd values or the even values are permitted. The 
reason for this is connected with nuclear spin and is adequately 
explained by Mayer and Mayer in the reference just given. We shall 
denote the partition function for which j is allowed to assume even 
integral values by $Z_{re}(T)$ and the one for the odd values by $Z_{ro}(T)$. 
The evaluation of the sum in (18) is a matter of approximation, the 
ease of which is determined largely by the value of $h^2/8\pi^2 IkT$, which, 
following Mayer and Mayer, we call σ for brevity. The values of σT 
for certain molecules are taken from Mayer and Mayer and presented 
in the accompanying table. For most diatomic molecules σ ≪ 1 for 
room temperature and indeed σ < 1 for temperatures in the neighborhood 
of the boiling points, the only exception being provided by 
hydrogen. 

⁴ Cf. Mayer and Mayer, "Statistical Mechanics," pp. 150, 172. 

TABLE OF σT VALUES FOR DIATOMIC MOLECULES 

Molecule    σT (in degrees C) 
H₂          84.97 
I₂          0.053 
N₂          2.85 
O₂          2.06 
HCl         14.95 

If σ is of the order of 0.5 or greater 

$$Z_r(T) = 1 + 3e^{-2\sigma} + 5e^{-6\sigma} + 7e^{-12\sigma} \tag{19}$$

is a sufficiently good approximation. For smaller values of σ, Mayer 
and Mayer have used the Euler-Maclaurin summation formula and 
have obtained the approximation, convergent for σ < 1, 

$$Z_r(T) = \frac{1}{\sigma}\left(1 + \frac{\sigma}{3} + \frac{\sigma^2}{15} + \frac{4\sigma^3}{315} + \cdots\right). \tag{20}$$

They also find that for small σ, i.e. less than 0.2, 

$$Z_{re}(T) = Z_{ro}(T) = \tfrac{1}{2}\,Z_r(T). \tag{21}$$

The contribution of the rotational energy states to the specific heat 
can now be obtained at once from eq. (5) with the result that 

$$c_V = \frac{k}{m}\left[1 + \frac{\sigma^2}{45} + \frac{16\sigma^3}{945} + \cdots\right] \tag{22}$$

for σ ≤ 1.0. This holds for small σ independently of whether the 
atoms of the molecule are the same or different since the derivative of 
the logarithm of $Z_{re}$ or $Z_{ro}$ is the same as that of $Z_r$ (from 21). Equation 
(22) possesses considerable interest. It shows that for high temperature 
(for which σ → 0), $c_V \to k/m$. As a matter of fact this limiting 
value is closely approached even at room temperature for nitrogen 
and oxygen and is reasonably well approximated even for hydrogen. 
For these gases under these conditions, then, the total contribution to 
the specific heat from translation and rotation becomes (3/2 + 1)k/m 
= (5/2)k/m. This confirms the conjecture in eq. (32), Sec. 2 of 
Chapter V, which is well substantiated by experimental observation. 
It suggests that at and above room temperature the vibrational energy 
states contribute little to the specific heat of a diatomic gas. We shall 
look into this theoretically presently. In the meantime we note that 
as T grows smaller, σ increases and the contribution of the rotational 
energy to the specific heat increases. However with further increase of 
σ, i.e. beyond unity, eq. (22) ceases to hold and a new expression based 
on (19) is indicated, but it is unnecessary to derive this in order to see 
that as σ increases, ultimately $Z_r$ approaches the constant value unity 
and $c_V$ approaches zero. We therefore expect at very low temperature 
little contribution to the specific heat from the rotational energy states. 
This agrees with the experimental observation that the specific heats of 
diatomic gases at constant volume decrease as the temperature 
decreases. 
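The behavior of the rotational contribution over the whole range of σ can be traced numerically. The Python sketch below sums the rotational partition function $\sum (2j+1)e^{-j(j+1)\sigma}$ directly, compares it with the Euler-Maclaurin form, and obtains the specific heat in units of k/m from the identity $\partial/\partial T\,(T^2\,\partial\log Z/\partial T) = \sigma^2\,d^2\log Z/d\sigma^2$, which holds because σ is inversely proportional to T. The function names, the cutoff `j_max`, and the finite-difference step are illustrative assumptions.

```python
import math

def z_rot_direct(sigma, j_max=400):
    """Direct sum of (2j+1) exp(-j(j+1) sigma) for j = 0 .. j_max."""
    return sum((2 * j + 1) * math.exp(-j * (j + 1) * sigma) for j in range(j_max + 1))

def z_rot_series(sigma):
    """Euler-Maclaurin form (1/sigma)(1 + sigma/3 + sigma^2/15 + 4 sigma^3/315)."""
    return (1.0 + sigma / 3.0 + sigma ** 2 / 15.0 + 4.0 * sigma ** 3 / 315.0) / sigma

def cv_rot(sigma, h=1e-4):
    """Rotational specific heat in units of k/m from the direct sum.

    Uses sigma^2 d^2(log Z)/d sigma^2 with a central second difference.
    """
    def f(s):
        return math.log(z_rot_direct(s))
    return sigma ** 2 * (f(sigma + h) - 2.0 * f(sigma) + f(sigma - h)) / h ** 2

def cv_rot_series(sigma):
    """Leading terms 1 + sigma^2/45 + 16 sigma^3/945 of the small-sigma expansion."""
    return 1.0 + sigma ** 2 / 45.0 + 16.0 * sigma ** 3 / 945.0

if __name__ == "__main__":
    for sigma in (0.05, 0.1, 0.5, 1.0, 3.0, 5.0):
        print(f"sigma = {sigma:4.2f}   c_V/(k/m) = {cv_rot(sigma):.5f}")
```

For small σ the direct sum and the series agree closely, and for large σ the computed specific heat falls toward zero as only the lowest rotational state remains populated.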

We now investigate the partition function for the vibrational energy 
states. From (14) this becomes 

$$Z_v(T) = \sum_{j=0}^{\infty} e^{-(j+1/2)h\nu/kT}. \tag{23}$$

Here the elementary weights $g_j$ all reduce to unity. Since $e^{-h\nu/kT} < 1$, 
we can write 

$$\sum_{j=0}^{\infty} e^{-jh\nu/kT} = \frac{1}{1 - e^{-h\nu/kT}}.$$

From this 

$$Z_v(T) = \frac{e^{-h\nu/2kT}}{1 - e^{-h\nu/kT}}, \tag{24}$$

$$\log Z_v(T) = -\frac{h\nu}{2kT} - \log\left(1 - e^{-h\nu/kT}\right), \tag{25}$$

and eq. (5) yields, after carrying out the indicated operations 

$$c_V = \frac{k}{m}\left(\frac{h\nu}{kT}\right)^2\frac{e^{h\nu/kT}}{\left(e^{h\nu/kT} - 1\right)^2} \tag{26}$$

as the contribution of the vibrational energy states to the specific heat. 
At very high temperatures hν/kT becomes considerably less than unity 
and inspection shows that under these conditions $c_V \to k/m$. However 
at room temperature and below, hν/kT > 1, e.g., for H₂ at 300 K, 
hν/kT = 20. Under these circumstances⁵ 

$$c_V \approx \frac{k}{m}\,\frac{(h\nu/kT)^2}{e^{h\nu/kT}},$$

or a very small fraction of k/m. Compared with the contributions of 
the translational and rotational energies, that of the vibrational energy 
at room temperature is negligible. This of course finds confirmation in 
the experimental results already referred to. The increase in specific 
heat at very high temperatures due to the vibrational energy contribution 
is also borne out by experimental measurements.⁶ 

⁵ For data on values of the frequency ν for diatomic molecules, see John C. 
Slater, "Introduction to Chemical Physics," pp. 132, 141, McGraw-Hill, New York, 
1939. 
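The smallness of the vibrational contribution at room temperature is easily put in numbers. The Python sketch below evaluates the vibrational specific heat in units of k/m as a function of x = hν/kT in the standard form $x^2 e^x/(e^x - 1)^2$, together with the large-x approximation $x^2 e^{-x}$. The function names are hypothetical.

```python
import math

def cv_vib(x):
    """Vibrational specific heat in units of k/m, with x = h*nu/(k*T)."""
    return x * x * math.exp(x) / math.expm1(x) ** 2

def cv_vib_low_t(x):
    """Approximation x^2 exp(-x), appropriate when x is large."""
    return x * x * math.exp(-x)

if __name__ == "__main__":
    for x in (0.01, 1.0, 5.0, 20.0):
        print(f"x = {x:5.2f}   exact = {cv_vib(x):.3e}   approx = {cv_vib_low_t(x):.3e}")
```

At x = 20, the value quoted for hydrogen at 300 K, the contribution is less than a millionth of k/m, while for x much less than unity it approaches k/m.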

So far as the use of statistics is concerned the same method outlined 
above for the determination of the specific heats of diatomic gases is 
also directly applicable to polyatomic gases. The quantum mechanical 
problem of determining the energy states of rotation and vibration is 
of course much more difficult because of the greater number of degrees 
of freedom involved.⁷ 

3. SPECIFIC HEAT OF A CRYSTALLINE SOLID 

A crystalline solid is an aggregate of atomic particles whose posi- 
tions of stable equilibrium form a regular three dimensional array or 
lattice. The atoms do not remain at rest in their equilibrium positions 
but move about them in vibratory motion with frequencies determined 
by the nature of the forces acting between them. Specifically a solid 
crystal consisting of N similar atoms has 3N degrees of freedom, and 
from the classical mechanics of vibrating systems it follows that such a 
system can oscillate in 3N harmonic modes. Quantum mechanically 
the possible energy values of the crystal are given by 

$$E = \sum_{j=1}^{3N} \left(n_j + \tfrac{1}{2}\right)h\nu_j, \tag{27}$$

where the 3N values of $\nu_j$ are the frequencies corresponding to the 
possible modes of oscillation and $n_j$ can take on all positive integral 
values for each j. Equation (27) is the generalization of (14). 
The partition function for the crystal becomes 

$$Z(T) = \sum_{n_1=0}^{\infty}\cdots\sum_{n_{3N}=0}^{\infty} \exp\left[-\sum_{j=1}^{3N}\left(n_j + \tfrac{1}{2}\right)\frac{h\nu_j}{kT}\right]. \tag{28}$$

⁶ Cf., for example, the data on CO quoted by Slater, op. cit., p. 145. 
⁷ The reader will find an adequate discussion in Mayer and Mayer, op. cit., 
pp. 179 ff. 

The exponential can be written as a product and hence Z can be 
expressed as a product of sums. Thus 

$$Z(T) = \prod_{j=1}^{3N}\sum_{n_j=0}^{\infty} e^{-(n_j+1/2)h\nu_j/kT} = \prod_{j=1}^{3N}\frac{e^{-h\nu_j/2kT}}{1 - e^{-h\nu_j/kT}}, \tag{29}$$

the second step being equivalent to that taken in (24). Further 

$$\log Z(T) = -\sum_{j=1}^{3N}\left[\frac{h\nu_j}{2kT} + \log\left(1 - e^{-h\nu_j/kT}\right)\right]. \tag{30}$$

We now make the assumption that the same statistical method 
which we have applied to gases should also hold on suitable modifica- 
tion for the 3N particles of the solid crystal. Examination of the 
fundamental postulates in Chapter VII shows that this is a justifiable 
procedure. Only one slight modification is necessary. In our previous 
discussion of ideal gases the partition function referred to a single 
particle of the gas. Now the partition function in (28) refers to the 
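whole assembly of oscillators. Consequently we must alter the fundamental 
formulas (3) and (4) giving the energy of the system in terms 
of the partition function by leaving out the factor N, since it already is 
included in Z. Thus we now have for our crystal in place of (4) 

$$E = kT^2\,\frac{\partial}{\partial T}\log Z(T). \tag{31}$$

The corresponding specific heat expression becomes 

$$c_V = \frac{1}{Nm}\,\frac{\partial}{\partial T}\left[kT^2\,\frac{\partial}{\partial T}\log Z(T)\right]. \tag{32}$$

For the crystalline solid this takes the form (from 30) 

$$c_V = -\frac{1}{Nm}\,\frac{\partial}{\partial T}\left\{kT^2\,\frac{\partial}{\partial T}\sum_{j=1}^{3N}\left[\frac{h\nu_j}{2kT} + \log\left(1 - e^{-h\nu_j/kT}\right)\right]\right\}. \tag{33}$$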
The result of the carrying out of the indicated operations is 

$$c_V = \frac{k}{Nm}\sum_{j=1}^{3N}\left(\frac{h\nu_j}{kT}\right)^2\frac{e^{h\nu_j/kT}}{\left(e^{h\nu_j/kT} - 1\right)^2}. \tag{34}$$

Let us suppose that the temperature is high so that for every $\nu_j$, 
$h\nu_j/kT \ll 1$. Each term in the sum in (34) becomes practically unity 
and the result is 

$$c_V = \frac{3k}{m}. \tag{35}$$

A more accurate approximation is 

$$c_V = \frac{3k}{m}\left[1 - \frac{1}{36N}\sum_{j=1}^{3N}\left(\frac{h\nu_j}{kT}\right)^2 + \cdots\right]. \tag{36}$$

The gram molecular heat capacity at high temperatures should thus 
approximate for all monatomic crystalline solids 

$$C_V = 3R = 5.96 \text{ cal/degree C}. \tag{37}$$

This is the law of Dulong and Petit. For most monatomic crystalline 
solids it is in excellent agreement with experiment even at room temperature. 

At low temperatures (34) approaches the asymptotic form 

$$c_V = \frac{k}{Nm}\sum_{j=1}^{3N}\left(\frac{h\nu_j}{kT}\right)^2 e^{-h\nu_j/kT}, \tag{38}$$

indicating that as the temperature approaches absolute zero, the 
specific heat should approach zero. This is in general agreement with 
experiment though the precise rate of approach to zero is not in accord 
with the exponential law (38). 

The endeavor to render the theory of the specific heat of a crystalline 
solid more exact necessitates the evaluation of the sum in (34) 
and this in turn demands a knowledge of the frequencies $\nu_j$. The most 
important attempt at a solution of this problem is due to Debye.⁸ 
Debye replaces the sum in (34) by an integral over the whole frequency 
range from zero to a certain maximum frequency and seeks the distribution 
of the oscillations among the various frequencies by finding 
the number of possible modes of oscillation of the crystal lattice in 
each frequency interval. 

⁸ Ann. der Physik 39, 789 (1912). Cf. also the discussion of the Debye theory 
in Slater, op. cit., pp. 222 ff. and Mayer and Mayer, "Statistical Mechanics," 
pp. 248 ff. 

Let us suppose that the crystal is replaced by an equivalent elastic 
solid medium taken in the form of a cube with side equal to l. The 
assumption was made by Debye that the actual lattice vibrations are 
the same as the allowed modes of elastic wave vibration (stationary 
elastic waves) in the equivalent continuous elastic medium. The 
latter are solutions of the wave equation for the two types of elastic 
waves in a solid medium subject to the boundary conditions at the 
faces of the cube. The two wave equations are respectively⁹ 



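$$\nabla^2\delta = \frac{1}{c_l^2}\,\frac{\partial^2\delta}{\partial t^2} \tag{39}$$

and 

$$\nabla^2\xi = \frac{1}{c_t^2}\,\frac{\partial^2\xi}{\partial t^2}. \tag{40}$$

Here δ is the dilatation in the medium and eq. (39) represents the 
propagation of δ as a longitudinal wave with velocity 

$$c_l = \sqrt{\frac{k + 4n/3}{\rho}}, \tag{41}$$

where k is the bulk modulus, n the shear modulus, and ρ the density. 
In (40) ξ is a transverse displacement propagated with velocity 

$$c_t = \sqrt{\frac{n}{\rho}}. \tag{42}$$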



We assume that the solid cube is traversed by plane harmonic waves of 
both longitudinal and transverse types. Such a wave will correspond 
to a displacement in the form¹⁰ 

$$A\,e^{2\pi i\nu[t - (\alpha x + \beta y + \gamma z)/c]}, \tag{43}$$

where A is the amplitude of the displacement and α, β, γ are the direction 
cosines of the normal to the plane wave front. The imposition 
of the boundary conditions assures that these progressive plane waves 
will become stationary plane waves with space part in the form 

$$\sin\frac{2\pi\nu}{c}\alpha x\;\sin\frac{2\pi\nu}{c}\beta y\;\sin\frac{2\pi\nu}{c}\gamma z, \tag{44}$$

where not all values of the frequency ν and the direction cosines 
α, β, γ are allowed, but only those which satisfy the boundary conditions, 
expressible in the form 

$$\sin\frac{2\pi\nu}{c}\alpha l = \sin\frac{2\pi\nu}{c}\beta l = \sin\frac{2\pi\nu}{c}\gamma l = 0. \tag{45}$$

These are satisfied for 

$$\nu\alpha = \frac{n_1 c}{2l}, \qquad \nu\beta = \frac{n_2 c}{2l}, \qquad \nu\gamma = \frac{n_3 c}{2l}, \tag{46}$$

where $n_1$, $n_2$, $n_3$ are any three integers. Since $\alpha^2 + \beta^2 + \gamma^2 = 1$, 
(46) suffices to fix the allowed frequencies of plane stationary waves 
in the solid as 

$$\nu_{n_1, n_2, n_3} = \frac{c}{2l}\sqrt{n_1^2 + n_2^2 + n_3^2}. \tag{47}$$

⁹ Cf., for example, Stewart and Lindsay, "Acoustics," pp. 328 ff., D. Van Nostrand 
Co., New York, 1930. 

¹⁰ Cf. Slater and Frank, "Introduction to Theoretical Physics," p. 253, 
McGraw-Hill, New York, 1933. 



To each triplet n\, n 2 , n% corresponds an allowed frequency and a 
direction for the associated plane wave motion. The problem to be 
solved is to find the number of allowed frequencies in the given fre- 
quency interval P, ? + dv. This is mathematically identical with the 
problem solved in Sec. 4 of Chapter VIII and we shall not need to 
present the analysis but shall merely give the final result, remarking 
only that for each direction we shall expect to find one longitudinal 
wave and two transverse waves. The total number of modes of plane 
wave harmonic vibration of the solid for which the frequency lies in 
the interval mentioned then becomes (with V = I 3 = volume of the 
cube) 



7V(v)rfi/ = 4?rF^ 2 I -3 + -3 ) dv. (48) 

\$ C\ / 
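The counting argument behind (48) can be checked numerically. The following is a minimal sketch, assuming a single wave velocity c and counting only triplets of positive integers; the function names and the sample values are illustrative and do not come from the text. For one polarization the continuous distribution is 4πVν²/c³, whose integral up to ν₀ is 4πVν₀³/3c³.

```python
import math

def count_modes_below(nu0, c, l):
    """Count triplets of positive integers (n1, n2, n3) whose frequency
    nu = (c / 2l) * sqrt(n1^2 + n2^2 + n3^2), eq. (47), is at most nu0."""
    r2 = int((2.0 * l * nu0 / c) ** 2)   # floor of the squared radius in n-space
    nmax = math.isqrt(r2)
    count = 0
    for n1 in range(1, nmax + 1):
        for n2 in range(1, nmax + 1):
            rest = r2 - n1 * n1 - n2 * n2
            if rest >= 1:
                count += math.isqrt(rest)  # number of n3 >= 1 with n3^2 <= rest
    return count

def continuum_modes_below(nu0, c, l):
    """Integral of 4*pi*V*nu^2/c^3 from 0 to nu0 (one polarization, V = l^3)."""
    return 4.0 * math.pi * l**3 * nu0**3 / (3.0 * c**3)

# Hypothetical values, chosen only so that 2*l*nu0/c = 60.
exact = count_modes_below(nu0=30.0, c=1.0, l=1.0)
smooth = continuum_modes_below(nu0=30.0, c=1.0, l=1.0)
print(exact, round(smooth), exact / smooth)
```

The discrete count falls slightly below the continuous estimate because of points lost near the coordinate planes; the relative difference shrinks as the radius in n-space grows.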

We now follow Debye in the hypothesis that to a first approximation
we can replace the real discrete distribution of frequencies in the
actual crystal lattice by the continuous distribution (48). Since the
total number of allowed frequencies is 3N, there must be an upper limit
to ν in (48). In fact we can compute this ν_max from the condition that

$$\int_0^{\nu_{\max}} N(\nu)\,d\nu = 4\pi V\left(\frac{2}{c_t^3} + \frac{1}{c_l^3}\right)\int_0^{\nu_{\max}} \nu^2\,d\nu = 3N. \tag{49}$$

This results in

$$\nu_{\max}^3 = \frac{9N}{4\pi V\left(2/c_t^3 + 1/c_l^3\right)}. \tag{50}$$

We can use (50) to express (48) in the form

$$N(\nu)\,d\nu = \frac{9N\nu^2}{\nu_{\max}^3}\,d\nu. \tag{51}$$

The specific heat expression in (34) can now be written in the integral
form

$$c_v = \frac{9k}{m\,\nu_{\max}^3}\int_0^{\nu_{\max}} \left(\frac{h\nu}{kT}\right)^2 \frac{e^{h\nu/kT}}{\left(e^{h\nu/kT} - 1\right)^2}\,\nu^2\,d\nu. \tag{52}$$

Let us abbreviate by writing w = hν/kT, whereupon (52) becomes

$$c_v = \frac{9k}{m\,w_{\max}^3}\int_0^{w_{\max}} \frac{w^4 e^w}{(e^w - 1)^2}\,dw. \tag{53}$$

It is usual to introduce the so-called Debye characteristic temperature
defined by

$$\Theta_D = \frac{h\nu_{\max}}{k}, \tag{54}$$

so that

$$w_{\max} = \frac{\Theta_D}{T}. \tag{55}$$

Equation (53) expresses c_v as a function of T/Θ_D. The integration
will not be undertaken. However we can get one interesting result
at once by noting that

$$\int_0^\infty \frac{w^4 e^w\,dw}{(e^w - 1)^2} = \frac{4\pi^4}{15}. \tag{56}$$

Hence for low temperatures where w_max may be expected to be large,
the specific heat is given by

$$c_v = \frac{12\pi^4}{5}\,\frac{k}{m}\left(\frac{T}{\Theta_D}\right)^3. \tag{57}$$

This dependence of the specific heat on the third power of T for small
T is well substantiated for a number of substances.
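Although the integration in (53) is not undertaken in the text, it is easy to carry out numerically. The sketch below evaluates the dimensionless quantity c_v m/k with a simple midpoint rule and compares it with the low temperature form (57); the function names and the step count are illustrative choices, not part of the original treatment.

```python
import math

def debye_cv_times_m_over_k(t_over_theta, steps=20000):
    """c_v * m / k from eq. (53), with w_max = Theta_D / T.
    The integrand w^4 e^w / (e^w - 1)^2 is rewritten as
    w^4 e^(-w) / (1 - e^(-w))^2 to avoid overflow for large w."""
    w_max = 1.0 / t_over_theta
    h = w_max / steps
    total = 0.0
    for i in range(steps):
        w = (i + 0.5) * h                 # midpoint rule, never touches w = 0
        d = -math.expm1(-w)               # 1 - e^(-w), accurate for small w
        total += w**4 * math.exp(-w) / (d * d)
    return 9.0 * total * h / w_max**3

def low_temperature_limit(t_over_theta):
    """c_v * m / k from eq. (57)."""
    return 12.0 * math.pi**4 / 5.0 * t_over_theta**3

for ratio in (0.02, 0.1, 1.0, 10.0):
    print(ratio, debye_cv_times_m_over_k(ratio), low_temperature_limit(ratio))
```

At T/Θ_D = 0.02 the two columns agree closely, while at high temperature the numerical value approaches 3, the classical result of three units of k per atom.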

The accompanying figure (Fig. 9-1) gives a plot of c_v as a function
of T/Θ_D. This should of course be the same for all crystalline solids.
The distinction between various solids comes in the value of Θ_D. This
characteristic temperature has been computed for several elements
both from observed specific heat data and from the experimental
elastic constants (by means of eq. 50). The values obtained by the two
methods agree rather well. 11 For example, the value of Θ_D for
aluminum calculated from the observed specific heat variation is 398° K
and that obtained from the elastic constants is 399° K. Another
possibility is to obtain Θ_D by fitting the low temperature range as
accurately as possible and not considering the higher temperatures.
If this is done for aluminum the value is 385° K. 12

[FIG. 9-1. Plot of c_v (cal per mole) against T/Θ_D.]



PROBLEMS 

1. Use eqs. (6) and (7) to derive the explicit expressions for the specific heat at 
constant volume of an ideal monatomic gas on the Fermi-Dirac and Bose-Einstein 
statistics respectively. Show that to a first approximation the classical formula 
c_v = 3k/2m results. Estimate the magnitude of the correction for oxygen at its
critical temperature, in the Bose-Einstein case. 

2. Derive the approximation (20) for the rotational partition function for an 
ideal diatomic gas. 

3. Prove that a plane harmonic wave of frequency v progressing in a direction 
with direction cosines α, β, γ, has its displacement in the form



Find the expression for the displacement in the stationary waves which arise from the 
reflection of plane progressive waves of the above type in a cubical vessel of side l.

4. Calculate the Debye characteristic temperature for copper, silver, and alumi- 
num from the observed elastic constants. Compare the results with the char- 
acteristic temperature for the same metals obtained from the observed specific heat 
values at low temperature. 

11 For details, see Slater, op. cit., p. 237. 

12 For a discussion of recent attempts to improve on Debye's theory by a more 
careful evaluation of the actual frequency spectrum of the crystalline solid, consult 
the book by Mott and Jones, "The Theory of the Properties of Metals and Alloys," 
pp. 6 ff., Oxford, 1936.
5. Show that the entropy of a crystalline solid can be written in the form



Find the result to which this reduces if all 3N frequencies of the crystal are assumed
to be identical and equal to ν. Find the corresponding result on the Debye theory
of frequency distribution. Show that on the Debye theory at temperatures low
compared with the characteristic temperature the entropy can be written approximately



'-""ilsV*. 

6. Derive the expression for the free energy of a crystalline solid on the Debye 
theory and show that at temperatures high compared with the characteristic tem- 
perature the value is approximately 



CHAPTER X 

QUANTUM STATISTICAL THEORY OF ELECTRICAL AND 
THERMAL PROPERTIES OF METALS 

1. THE LORENTZ-SOMMERFELD THEORY OF ELECTRICAL AND 
THERMAL CONDUCTION 

The most elaborate classical statistical theory of the conduction of 
electricity and heat in metals is due to Lorentz. 1 He attributed both 
effects to the motion of free electrons and treated the electron gas in 
the metal by the classical Maxwell-Boltzmann statistics. Sommerfeld 2 
modified the Lorentz theory by the introduction of quantum statistics. 
We have already noted that an electron gas such as the Lorentz
theory envisages in a metal, with approximately one electron per
metallic atom, must be strongly degenerate even at very high temperatures.
The distribution law to be used will therefore be that given in eq. (47)
of Chapter VIII with a = +1, and with γ₁ given by eq. (129) of
Chapter VIII. We find it convenient to express it in terms of velocity
components rather than energy. This makes it necessary to rewrite
the expression for n_j in terms of the velocity components v_x, v_y, v_z in
place of energy. To revert to Sec. 4, Chapter VIII we see that the
possible energy values of a free particle in a closed vessel, given in
eq. (59), are really equivalent to saying that the possible kinetic
energy values corresponding to velocity components v_x, v_y, v_z along
the three coordinate axes respectively are

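$$\frac{m v_x^2}{2} = \frac{n_1^2 h^2}{8 m l_1^2}, \qquad \frac{m v_y^2}{2} = \frac{n_2^2 h^2}{8 m l_2^2}, \qquad \frac{m v_z^2}{2} = \frac{n_3^2 h^2}{8 m l_3^2}.$$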



1 H. A. Lorentz, "Theory of Electrons," pp. 267 ff., New York, 1916. 

2 A. Sommerfeld, Z. Physik 47, 1 (1928). See also "Handbuch der Physik,"
second edition, Vol. 24, p. 333. Springer, Berlin, 1933.


This should be clear from eqs. (55) and (58) of Chapter VIII. Hence
the number of energy values for which v_x lies between v_x and v_x + dv_x is
simply

$$dn_1 = \frac{2 m l_1}{h}\,dv_x,$$

with similar expressions for the y and z directions. The total number
of energy values for which v_x is in the interval v_x, v_x + dv_x and similarly
for v_y and v_z then becomes 3

$$\frac{dn_1\,dn_2\,dn_3}{8} = \frac{m^3}{h^3}\,\tau\,dv_x\,dv_y\,dv_z. \tag{1}$$

If we assign two electrons to each energy value to take care of the spin,
we have for our new n_j the expression

$$n_j = \frac{2m^3}{h^3}\,\tau\,dv_x\,dv_y\,dv_z, \tag{2}$$

and the distribution law to be used in our discussion takes the form

$$dN = \frac{(2\tau m^3/h^3)\,dv_x\,dv_y\,dv_z}{1 + e^{-\gamma_1 + m(v_x^2 + v_y^2 + v_z^2)/2kT}}. \tag{3}$$

For the sake of convenience we shall mainly employ

$$f_0(x, v_x, v_y, v_z) = \frac{dN}{\tau\,dv_x\,dv_y\,dv_z} = \frac{2m^3}{h^3}\,\frac{1}{1 + e^{-\gamma_1 + m(v_x^2 + v_y^2 + v_z^2)/2kT}}. \tag{4}$$

The function f₀ is written as a function of x as well as v_x, v_y, v_z since we
are assuming that the metal in question is in the form of a rod directed
along the x axis. We wish to express the fact that the rod may not be
homogeneous, whence γ₁ will depend on x. Moreover if a temperature
gradient exists, T will be a function of x. Hence the distribution
function will in general depend on x.

If the metal is subjected to no external influences the distribution
function f₀ will remain unchanged at any particular place in the metal.
We must indeed expect the electrons to suffer energy changes by
collision with the metallic atoms, but on the average as many electrons
will be expected to gain as to lose energy and the f₀ should not be
altered by this effect. However, the situation will be different when an
electric field ℰ(x) is applied to the metal.

3 We revert to the use of τ for physical volume.

Consider the electrons at a given instant t in a small volume dτ of
the metal at the point x, y, z and with their velocity components in the
interval v_x, v_x + dv_x, etc. At the time t + dt, if there were no
collisions, these electrons would be displaced to the same volume around
the point (x + v_x dt, y + v_y dt, z + v_z dt) and their velocity components
would become v_x + (eℰ/m) dt, v_y, v_z. Note that we are assuming that
the applied field acts only along the x axis. Now because of collisions
with the atoms of the metal a certain number of electrons having
velocity components in the interval in question at time t will have
passed out of this interval by t + dt. Let this number per second be
denoted by

$$a\,d\tau\,dv_x\,dv_y\,dv_z.$$

Similarly we shall assume that

$$b\,d\tau\,dv_x\,dv_y\,dv_z$$

have their velocity components brought into the interval per second.
If there were no external field acting, equilibrium would be maintained
by

$$a = b.$$

However in general we shall have when a steady state is established

$$f\left(x + v_x dt,\; y + v_y dt,\; z + v_z dt,\; v_x + \frac{e\mathcal{E}}{m}dt,\; v_y,\; v_z\right) = f(x, y, z, v_x, v_y, v_z) + (b - a)\,dt, \tag{5}$$

where f is the generalized distribution function in the presence of the
external field. When the field vanishes f reduces to f₀. Equation (5)
leads to

$$b - a = v_x\frac{\partial f}{\partial x} + \frac{e\mathcal{E}}{m}\frac{\partial f}{\partial v_x}. \tag{6}$$

Because f is a function of x alone, ∂f/∂y and ∂f/∂z are absent in (6).

dy dz 

In order to utilize this equation it is necessary to obtain an expression
for b − a. This involves a study of the collisions of electrons with
atoms. Let us assume that the atoms are elastic spheres of radius
R and that the number of electrons per unit volume is equal to n. The
collisions of the electrons with the atoms are then elastic. In Fig. 10-1
we have indicated such an atom with center at O. Construct the solid
angle dω at O subtending the surface element dS = R² dω at the surface
of the atom. Consider the electrons with velocity v in the velocity
interval v_x, v_x + dv_x, etc. The number of these which strike dS per
second is the number in a cylinder of length v and base area dS·cos β,
where β is the angle between v and the inward drawn normal to dS.

[FIG. 10-1. Collision of an electron of velocity v with an atom of center O at the surface element dS.]

Consequently the number per unit volume is

$$\cos\beta \cdot v\,f(x, y, z, v_x, v_y, v_z)\,dv_x\,dv_y\,dv_z\,dS.$$

Hence the average number of collisions per unit volume per second
suffered by the electrons in the given velocity interval would appear to
be

$$dv_x\,dv_y\,dv_z \int nR^2 v\cos\beta \cdot f(x, y, z, v_x, v_y, v_z)\,d\omega. \tag{7}$$

This assumes that all the collisions are possible, but this is true only if
the gas is one obeying the classical Maxwell-Boltzmann statistics. In
the Fermi-Dirac gas, which we are considering here, only those
collisions are possible for which the final state was originally empty.
Consequently we must correct (7) by writing it in the form

$$dv_x\,dv_y\,dv_z \int \left[1 - \frac{h^3}{2m^3}f(x, y, z, v'_x, v'_y, v'_z)\right] nR^2 v\cos\beta \cdot f(x, y, z, v_x, v_y, v_z)\,d\omega, \tag{7'}$$

where v′_x, v′_y, v′_z are the component velocities after collision. The
number of collisions for which f(x, y, z, v′_x, v′_y, v′_z) = 2m³/h³ then becomes
zero. The integral in (7′) is the expression for a in (6).

Since we are assuming that the collisions are elastic we can calculate
the velocity components after collision by the law of reflection on
elastic impact. Thus, the velocity after impact being v′ with
components v′_x, v′_y, v′_z (where |v′| = |v|) the projection of v − v′ on the
normal to the surface element dS is 2v cos β, and hence if the direction
cosines of the normal are λ, μ, ν, we have

$$\begin{aligned} v_x - v'_x &= 2v\cos\beta\cdot\lambda, \\ v_y - v'_y &= 2v\cos\beta\cdot\mu, \\ v_z - v'_z &= 2v\cos\beta\cdot\nu. \end{aligned} \tag{8}$$

This assumes, of course, that the atoms of the metal are stationary.
Since they are so much more massive than the electrons, this
assumption is reasonable. Similarly if the original velocity components are
v′_x, v′_y, v′_z the final ones after impact will be v_x, v_y, v_z as given in (8).
Hence the expression for b in (6) is simply the integral

$$\int \left[1 - \frac{h^3}{2m^3}f(x, y, z, v_x, v_y, v_z)\right] nR^2 v\cos\beta \cdot f(x, y, z, v'_x, v'_y, v'_z)\,d\omega, \tag{9}$$

where the coefficient [1 − (h³/2m³) f(x, y, z, v_x, v_y, v_z)] is inserted for the same
reason as in (7′). Let us now assume that the distribution function as
altered by the field can be expressed in terms of the original
distribution function f₀ by means of the relation

$$f = f_0 + v_x\cdot\chi(x, v_x, v_y, v_z), \tag{10}$$

where χ, like f₀, is assumed to be a function of the velocity components
through the magnitude v = √(v_x² + v_y² + v_z²) only. This assumption is
rendered plausible by the recollection that the field is directed along the
x axis and hence should be expected to change f principally through v_x.
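Equation (6) now takes the form

$$v_x\frac{\partial f}{\partial x} + \frac{e\mathcal{E}}{m}\frac{\partial f}{\partial v_x} = nR^2 v\,\chi(x, v)\int (v'_x - v_x)\cos\beta\,d\omega, \tag{11}$$

if we assume that χ(x, v′_x, v′_y, v′_z) = χ(x, v_x, v_y, v_z), i.e., the total energy
of the electron remains unchanged by the collision. If we consider χ as
a sort of correction function it will be proper to neglect v_x ∂χ/∂x compared
with ∂f₀/∂x, and χ + v_x ∂χ/∂v_x compared with ∂f₀/∂v_x. Moreover we shall write

$$\frac{\partial f_0}{\partial v_x} = \frac{v_x}{v}\frac{\partial f_0}{\partial v}.$$

Finally we use the first equation in (8). Then eq. (11) becomes

$$v_x\left(\frac{\partial f_0}{\partial x} + \frac{e\mathcal{E}}{mv}\frac{\partial f_0}{\partial v}\right) = -2nR^2 v^2\chi\int \lambda\cos^2\beta\,d\omega. \tag{12}$$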

We must now evaluate the integral on the right side. Let α be the
angle between the normal to the atomic surface at dS and the x axis.
Thus λ = cos α. We now take the velocity v as the polar axis of a
system of spherical coordinates in which β is the co-latitude angle
and φ is, as usual, the longitude. By the application of the usual rule for
the cosine of the angle between two lines we have

$$\lambda = \cos\alpha = \frac{v_x}{v}\cos\beta + \frac{\sqrt{v_y^2 + v_z^2}}{v}\sin\beta\cos\phi.$$

Then

$$d\omega = \frac{dS}{R^2} = \sin\beta\,d\beta\,d\phi.$$

Consequently the integral in (12) becomes

$$\int \lambda\cos^2\beta\,d\omega = \int_0^{2\pi}\!\!\int_0^{\pi/2}\left[\frac{v_x}{v}\cos\beta + \frac{\sqrt{v_y^2 + v_z^2}}{v}\sin\beta\cos\phi\right]\sin\beta\cos^2\beta\,d\beta\,d\phi,$$

where the upper limit of the β integration is π/2 instead of π since we
are clearly entitled to count in our calculation the solid angle for a
hemisphere only. The term involving √(v_y² + v_z²) goes out in the
integration and we are finally left with

$$\int \lambda\cos^2\beta\,d\omega = \frac{\pi v_x}{2v}.$$

The resulting differential equation for f then becomes

$$\chi = -\frac{1}{\pi nR^2 v}\left(\frac{\partial f_0}{\partial x} + \frac{e\mathcal{E}}{mv}\frac{\partial f_0}{\partial v}\right). \tag{13}$$
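The angular integral over the hemisphere, ∫λ cos²β dω = πv_x/2v, can be confirmed by direct numerical quadrature. The following is a minimal sketch using a midpoint rule in β and φ; the function name, grid sizes, and sample velocity are illustrative assumptions.

```python
import math

def lambda_cos2_integral(vx, vy, vz, n_beta=200, n_phi=200):
    """Midpoint-rule value of the integral of lambda * cos^2(beta) over the
    hemisphere 0 <= beta <= pi/2, 0 <= phi <= 2*pi, with
    lambda = (vx/v) cos(beta) + (sqrt(vy^2 + vz^2)/v) sin(beta) cos(phi)
    and d(omega) = sin(beta) d(beta) d(phi)."""
    v = math.sqrt(vx * vx + vy * vy + vz * vz)
    v_perp = math.sqrt(vy * vy + vz * vz)
    d_beta = (math.pi / 2.0) / n_beta
    d_phi = 2.0 * math.pi / n_phi
    total = 0.0
    for i in range(n_beta):
        beta = (i + 0.5) * d_beta
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            lam = (vx / v) * math.cos(beta) + (v_perp / v) * math.sin(beta) * math.cos(phi)
            total += lam * math.cos(beta) ** 2 * math.sin(beta)
    return total * d_beta * d_phi

# Hypothetical velocity components in arbitrary units (v = 3).
numeric = lambda_cos2_integral(1.0, 2.0, 2.0)
closed_form = math.pi * 1.0 / (2.0 * 3.0)
print(numeric, closed_form)
```

The term in cos φ averages to zero over the longitude, which is why only the v_x contribution survives.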

In line with our classical kinetic theory discussion in Sec. 3 of Chapter
V, it is plausible to define the effective mean free path of the electrons, as
far as collisions with the atoms of the metal are concerned, as

$$l = \frac{1}{\pi nR^2}. \tag{14}$$

With this definition we can write (13) in the form

$$\chi = -\frac{l}{v}\left(\frac{\partial f_0}{\partial x} + \frac{e\mathcal{E}}{mv}\frac{\partial f_0}{\partial v}\right). \tag{15}$$

It should be pointed out that it is unnecessary to take the definition
(14) too seriously. We could carry the quantity 1/πnR² through our
analysis without giving it a special name if we chose. As a matter of
fact, the theory being presented here is a formal one in the sense that
we shall not attempt to calculate l in a precise manner. The quantity
1/πnR² has the dimensions of length and indeed if we insert reasonable
values of n and R, e.g., for silver, n = 5.9 × 10²², R ≈ 10⁻⁸ cm, l
comes out of the order of 5 × 10⁻⁸ cm, which is not unreasonable. It
will later prove possible to deduce formulas from which l disappears by
cancellation. We shall naturally be able to attach greater significance
to such equations as tests of the theory.
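The order of magnitude quoted for silver follows directly from the definition l = 1/πnR². A short sketch of the arithmetic, with an illustrative function name, is:

```python
import math

def mean_free_path(n_per_cm3, radius_cm):
    """Effective mean free path l = 1 / (pi * n * R^2), eq. (14), in cm."""
    return 1.0 / (math.pi * n_per_cm3 * radius_cm**2)

# Values quoted in the text for silver: n = 5.9e22 per cm^3, R about 1e-8 cm.
l_silver = mean_free_path(5.9e22, 1.0e-8)
print(l_silver)  # of the order of 5e-8 cm
```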

We are now ready to evaluate the electric and thermal current
densities in the direction of the x axis. The electric current density or
rate of flow of charge per unit area per second is by definition

$$J = e\int\!\!\int\!\!\int_{-\infty}^{+\infty} v_x f\,dv_x\,dv_y\,dv_z. \tag{16}$$

The thermal current density or the total rate of transfer of kinetic
energy per unit area per second is

$$C = \frac{m}{2}\int\!\!\int\!\!\int_{-\infty}^{+\infty} v^2 v_x f\,dv_x\,dv_y\,dv_z. \tag{17}$$

Substituting f = f₀ + v_x χ and recalling that f₀ is an even function of
v_x, v_y, v_z, we get

$$J = e\int\!\!\int\!\!\int_{-\infty}^{+\infty} v_x^2\,\chi\,dv_x\,dv_y\,dv_z. \tag{18}$$

Similarly

$$C = \frac{m}{2}\int\!\!\int\!\!\int_{-\infty}^{+\infty} v^2 v_x^2\,\chi\,dv_x\,dv_y\,dv_z. \tag{19}$$

When we proceed now to spherical coordinates, we replace dv_x dv_y dv_z by
v² sin θ dv dθ dφ, with v² = v_x² + v_y² + v_z², θ the angle v makes with the v_z
axis and φ the angle its projection on the v_x v_y plane makes with the v_x
axis. Integrating out the θ and φ parts gives

$$J = \frac{4\pi e}{3}\int_0^\infty v^4\chi\,dv \tag{20}$$

and

$$C = \frac{2\pi m}{3}\int_0^\infty v^6\chi\,dv. \tag{21}$$

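The next step is to substitute χ from (15) into these expressions. The
result is

$$J = -\frac{4\pi e}{3}\left[\int_0^\infty l v^3\frac{\partial f_0}{\partial x}\,dv + \frac{e\mathcal{E}}{m}\int_0^\infty l v^2\frac{\partial f_0}{\partial v}\,dv\right], \tag{22}$$

$$C = -\frac{2\pi m}{3}\left[\int_0^\infty l v^5\frac{\partial f_0}{\partial x}\,dv + \frac{e\mathcal{E}}{m}\int_0^\infty l v^4\frac{\partial f_0}{\partial v}\,dv\right]. \tag{23}$$

An integration by parts performed on the second integral in each
expression yields

$$J = \frac{4\pi e}{3}\left[\frac{e\mathcal{E}}{m}\int_0^\infty f_0\frac{\partial (l v^2)}{\partial v}\,dv - \int_0^\infty l v^3\frac{\partial f_0}{\partial x}\,dv\right], \tag{24}$$

$$C = \frac{2\pi m}{3}\left[\frac{e\mathcal{E}}{m}\int_0^\infty f_0\frac{\partial (l v^4)}{\partial v}\,dv - \int_0^\infty l v^5\frac{\partial f_0}{\partial x}\,dv\right]. \tag{25}$$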

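It is convenient at this point to introduce the change in variable,
mv²/2kT = u. Then the operator ∂/∂v becomes

$$\frac{\partial}{\partial v} = \frac{mv}{kT}\frac{\partial}{\partial u},$$

while

$$dv\,\frac{\partial}{\partial v} = du\,\frac{\partial}{\partial u}.$$

Now for convenience let lv² = L, whence lv³ dv = (kT/m) L du. Moreover
lv⁴ = v²L = (2kT/m) uL, etc., so that the expressions (24) and (25) now
appear as

$$J = \frac{4\pi e}{3}\left[\frac{e\mathcal{E}}{m}\int_0^\infty f_0\frac{\partial L}{\partial u}\,du - \frac{kT}{m}\int_0^\infty \frac{\partial f_0}{\partial x}L\,du\right], \tag{26}$$

$$C = \frac{2\pi m}{3}\left[\frac{e\mathcal{E}}{m}\,\frac{2kT}{m}\int_0^\infty f_0\frac{\partial (uL)}{\partial u}\,du - \frac{1}{2}\left(\frac{2kT}{m}\right)^2\int_0^\infty \frac{\partial f_0}{\partial x}\,uL\,du\right]. \tag{27}$$

The function f₀ given in eq. (4) takes the form

$$f_0 = \frac{2m^3}{h^3}\,\frac{1}{1 + e^{-\gamma_1 + u}}. \tag{28}$$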
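Hence on the assumption that in general both γ₁ and T vary with x

$$\frac{\partial f_0}{\partial x} = -\frac{\partial f_0}{\partial u}\left(\frac{\partial \gamma_1}{\partial x} + \frac{u}{T}\frac{\partial T}{\partial x}\right). \tag{29}$$

We can integrate ∫₀^∞ (∂f₀/∂u) L du by parts and find it equal to
−∫₀^∞ f₀ (∂L/∂u) du, since the integrated part vanishes at both limits.
Similarly

$$\int_0^\infty \frac{\partial f_0}{\partial u}\,uL\,du = -\int_0^\infty f_0\frac{\partial (uL)}{\partial u}\,du.$$

Finally, if we abbreviate by setting

$$\int_0^\infty f_0\frac{\partial L}{\partial u}\,du = A_1, \qquad \int_0^\infty f_0\frac{\partial (uL)}{\partial u}\,du = A_2, \qquad \int_0^\infty f_0\frac{\partial (u^2L)}{\partial u}\,du = A_3,$$

the two current densities become

$$J = \frac{4\pi e}{3m}A_1\left[e\mathcal{E} - kT\frac{\partial \gamma_1}{\partial x} - k\frac{\partial T}{\partial x}\frac{A_2}{A_1}\right], \tag{30}$$

$$C = \frac{4\pi kT}{3m}A_2\left[e\mathcal{E} - kT\frac{\partial \gamma_1}{\partial x} - k\frac{\partial T}{\partial x}\frac{A_3}{A_2}\right]. \tag{31}$$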

We are now ready to apply the formulas (30) and (31). If the medium
is homogeneous and there exists no temperature gradient, (30) reduces
to

$$J = \frac{4\pi e^2}{3m}A_1\mathcal{E}. \tag{32}$$

The electrical conductivity is defined as the reciprocal of the specific
resistance, which is the ratio of the electric field intensity to the current
density. Hence for the conductivity, σ, we have

$$\sigma = \frac{J}{\mathcal{E}} = \frac{4\pi e^2}{3m}A_1. \tag{33}$$

From (29)

$$A_1 = \frac{2kT}{m}\int_0^\infty f_0\frac{\partial (lu)}{\partial u}\,du. \tag{34}$$

We must assume that l is a function of u, though we do not know the
precise dependence. Let us expand ul(u) in a power series:

$$ul = \sum_j a_j u^{j+1}, \tag{35}$$

and

$$\frac{\partial (ul)}{\partial u} = \sum_j (j+1)\,a_j u^j.$$

Substitution into (34) yields

$$A_1 = \frac{2kT}{m}\sum_j (j+1)\,a_j\int_0^\infty f_0 u^j\,du. \tag{36}$$

We recognize ∫₀^∞ f₀ u^j du as identical with U′(j, γ₁) in eq. (122)
of Chapter VIII. The first approximation yields

$$A_1 = \frac{2kT}{m}\,\frac{2m^3}{h^3}\sum_j a_j\gamma_1^{j+1},$$

which from (35) becomes

$$A_1 = \frac{2kT}{m}\,\frac{2m^3}{h^3}\,\gamma_1 l(\gamma_1), \tag{37}$$

where l(γ₁) means the value of l(u) with γ₁ inserted for u. The
conductivity to the first approximation takes the form

$$\sigma = \frac{16\pi m e^2 kT}{3h^3}\,\gamma_1 l(\gamma_1). \tag{38}$$

Note from eq. (138) of Chapter VIII that l(γ₁) is the value of the mean
free path corresponding to the velocity v₀. The utility of this formula
for electrical conductivity is severely restricted by our ignorance of the
dependence of l on the velocity. By using the first approximation to
the degenerate form of γ₁ from (128) of Chapter VIII with 2τ in place
of τ we can rewrite (38) in the form

$$\sigma = \frac{8\pi e^2}{3h}\left(\frac{3N}{8\pi\tau}\right)^{2/3} l(\gamma_1). \tag{39}$$

In this formula the temperature no longer enters explicitly. The
temperature variation of conductivity must therefore be sought in that
of N/τ and of the mean free path l(γ₁). Since the change of N/τ with
temperature is a matter of thermal expansion only, it is extremely
slight and scarcely competent to account for the change in σ. Hence
the burden is laid on l(γ₁). Calculations based on (39) using the
experimental values of σ indicate that for copper, for example, on the
assumption of one free electron per atom l will vary from about
7 × 10⁻⁷ cm at 1000° C to about 4 × 10⁻⁵ cm at −200° C. These
figures are larger than the value 5 × 10⁻⁸ cm computed from the
simple formula

$$l = \frac{1}{\pi nR^2}$$

with 10⁻⁸ cm for R. This is an indication of the formal character of
the theory developed in this section. The assumption that the atoms
are perfectly elastic spheres is clearly not a particularly good one,
unless we are willing to assign to the radius of such spheres a smaller
value than 10⁻⁸ cm. This is, of course, a possibility; the effective
radius of an atom for a collision may be much smaller than the
dimensions of the core of the atom from the quantum mechanical standpoint.
The variation of l with temperature is another matter untouched by the
formal theory.
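The kind of calculation referred to above can be sketched as follows, taking the degenerate conductivity formula in the form σ = (8πe²/3h)(3N/8πτ)^{2/3} l and solving for l. The relation holds equally in SI units, which are used here for convenience. The copper values (conductivity near room temperature and one free electron per atom) are rounded handbook figures supplied as assumptions; they are not taken from the text.

```python
import math

H_PLANCK = 6.62607015e-34    # J s
E_CHARGE = 1.602176634e-19   # C

def mean_free_path_from_conductivity(sigma, n):
    """Solve sigma = (8 pi e^2 / 3h) (3n / 8 pi)^(2/3) l for l.
    sigma in S/m, n in electrons per m^3, result in m."""
    return 3.0 * H_PLANCK * sigma / (
        8.0 * math.pi * E_CHARGE**2 * (3.0 * n / (8.0 * math.pi)) ** (2.0 / 3.0)
    )

# Assumed illustrative values for copper near room temperature.
l_cu_m = mean_free_path_from_conductivity(sigma=5.9e7, n=8.5e28)
l_cu_cm = 100.0 * l_cu_m
print(l_cu_cm)  # a few times 1e-6 cm
```

The room temperature result lies between the two extremes quoted in the text for 1000° C and −200° C, and is far larger than the value given by the elastic sphere formula.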

Let us next proceed to the thermal conductivity. In eq. (30),
suppose that J = 0, i.e., zero electric current flow. The result shows
that there must still exist an electric field intensity as long as a
temperature gradient exists or the metal is non-homogeneous. Its value
is given by

$$e\mathcal{E} = kT\frac{\partial \gamma_1}{\partial x} + k\frac{\partial T}{\partial x}\frac{A_2}{A_1}. \tag{40}$$

We substitute this into the expression for the heat flow (31) and obtain

$$C = -\frac{4\pi k^2 T}{3m}\frac{\partial T}{\partial x}A_2\left[\frac{A_3}{A_2} - \frac{A_2}{A_1}\right]. \tag{41}$$

The thermal conductivity K, which by definition is the ratio of C to the
negative temperature gradient, becomes

$$K = \frac{4\pi k^2 T}{3m}A_2\left[\frac{A_3}{A_2} - \frac{A_2}{A_1}\right]. \tag{42}$$

To evaluate this we must compute A₂ and A₃. From (29) we have

$$A_2 = \frac{2kT}{m}\int_0^\infty f_0\frac{\partial (u^2 l)}{\partial u}\,du. \tag{43}$$

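We now fall back on the expansion (35) for ul. By the use of this, (43)
becomes

$$A_2 = \frac{2kT}{m}\sum_j (j+2)\,a_j\int_0^\infty f_0 u^{j+1}\,du. \tag{44}$$

We now revert to the expression for U′ in eq. (122) of Chapter VIII,
and can write to the second approximation

$$A_2 = \frac{2kT}{m}\,\frac{2m^3}{h^3}\left[\gamma_1^2\,l(\gamma_1) + 2c_2\left(\frac{\partial^2 (u^2 l)}{\partial u^2}\right)_{u=\gamma_1}\right]. \tag{45}$$

Next we must get A₃ as

$$A_3 = \frac{2kT}{m}\int_0^\infty f_0\frac{\partial (u^3 l)}{\partial u}\,du.$$

By proceeding as before, we have to the second approximation

$$A_3 = \frac{2kT}{m}\,\frac{2m^3}{h^3}\left[\gamma_1^3\,l(\gamma_1) + 2c_2\left(\frac{\partial^2 (u^3 l)}{\partial u^2}\right)_{u=\gamma_1}\right]. \tag{46}$$

We are now ready to evaluate K. In the ratio A₂/A₁ we need to
consider only the first approximation to A₁, namely, that given in
(37), since simple inspection shows that the terms in the next
approximation involve derivatives of l only, which we shall neglect throughout.
However, in A₂ and A₃ we must consider that part of the second
approximation which does not involve the derivatives. We finally
obtain

$$\frac{A_2}{A_1} = \gamma_1 + \frac{4c_2}{\gamma_1}$$

and

$$\frac{A_3}{A_2} = \gamma_1 + \frac{8c_2}{\gamma_1}.$$

Consequently the heat conductivity becomes (with π²/12 for c₂)

$$K = \frac{16\pi^3 m k^3 T^2}{9h^3}\,\gamma_1 l(\gamma_1). \tag{47}$$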
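The two ratios A₂/A₁ = γ₁ + 4c₂/γ₁ and A₃/A₂ = γ₁ + 8c₂/γ₁ (with c₂ = π²/12) can be checked numerically for a strongly degenerate gas. The sketch below assumes a mean free path l independent of u, so that the common factor (2kT/m)(2m³/h³)l cancels from the ratios and A_p is proportional to the integral of d(u^p)/du against the Fermi factor. The function names, the value γ₁ = 40, and the integration cutoff are illustrative assumptions.

```python
import math

def fermi_factor(u, gamma):
    """1 / (1 + exp(u - gamma)), arranged to avoid overflow."""
    x = u - gamma
    if x > 0.0:
        ex = math.exp(-x)
        return ex / (1.0 + ex)
    return 1.0 / (1.0 + math.exp(x))

def a_integral(p, gamma, steps=100000):
    """Midpoint-rule value of the integral over u from 0 to gamma + 60 of
    fermi_factor(u) * d(u^p)/du, proportional to A_p for constant l."""
    upper = gamma + 60.0      # the Fermi factor is negligible beyond this
    h = upper / steps
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * h
        total += fermi_factor(u, gamma) * p * u ** (p - 1)
    return total * h

gamma1 = 40.0
a1, a2, a3 = (a_integral(p, gamma1) for p in (1, 2, 3))
c2 = math.pi**2 / 12.0
print(a2 / a1, gamma1 + 4.0 * c2 / gamma1)
print(a3 / a2, gamma1 + 8.0 * c2 / gamma1)
```

The difference A₃/A₂ − A₂/A₁, which is what enters the thermal conductivity, comes out close to 4c₂/γ₁ = π²/3γ₁, small compared with either ratio separately.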

Let us now take the ratio of K to σT. The result is

$$\frac{K}{\sigma T} = \frac{16\pi^3 m k^3 T^2\,\gamma_1 l(\gamma_1)}{9h^3}\cdot\frac{3h^3}{16\pi m e^2 kT^2\,\gamma_1 l(\gamma_1)} = \frac{\pi^2}{3}\,\frac{k^2}{e^2}. \tag{48}$$

We have thus derived the well-known Wiedemann-Franz law connecting
the electrical and thermal conductivities of metallic conductors.
If we use the electromagnetic unit of charge for e, the value of the
constant on the right side of (48) turns out to be approximately

$$\frac{\pi^2}{3}\,\frac{k^2}{e^2} = 2.44 \times 10^8\ \frac{\text{ergs}^2}{(\text{degree C})^2\,(\text{emu})^2}.$$

At room temperature (293° K) we have

$$\frac{K}{\sigma} = 7.15 \times 10^{10} \tag{49}$$


in absolute units. This agrees rather well with the mean of the
experimental values for such metals as Al, Cu, Ag, Ni, Zn, Cd, Pb, Sn, Pt,
Pd, Fe. The following table (taken from H. Fröhlich, "Elektronentheorie
der Metalle," Springer, Berlin, 1936) shows the nature of the
agreement. It gives the experimental values of K/σT for two different
temperatures.

                      K/σT × 10⁻⁸

T (° K)    Cu     Ag     Au     Zn     Cd     Pb     Fe

291        2.28   2.36   2.43   2.31   2.42   2.45   2.88
373        2.32   2.37   2.45   2.33   2.43   2.51   3.00
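The constant in the Wiedemann-Franz law is quickly evaluated. The sketch below uses present-day values of k and e expressed in ergs per degree and electromagnetic units (assumed here, and slightly different from the values available when the text was written), and also evaluates the classical Lorentz constant 2k²/e² for comparison.

```python
import math

K_BOLTZMANN_ERG = 1.380649e-16      # erg per degree
E_CHARGE_EMU = 1.602176634e-20      # electromagnetic units of charge

def wiedemann_franz_constant():
    """(pi^2 / 3) (k/e)^2, the Sommerfeld value of K / (sigma T)."""
    return math.pi**2 / 3.0 * (K_BOLTZMANN_ERG / E_CHARGE_EMU) ** 2

def lorentz_constant():
    """2 (k/e)^2, the classical Lorentz value of K / (sigma T)."""
    return 2.0 * (K_BOLTZMANN_ERG / E_CHARGE_EMU) ** 2

sommerfeld = wiedemann_franz_constant()
classical = lorentz_constant()
print(sommerfeld / 1e8, classical / 1e8, sommerfeld * 293.0 / 1e10)
```

The Sommerfeld value falls within the range of the tabulated experimental figures, while the classical value is clearly below all of them.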



Of these metals, iron is the only one to deviate notably from the
theoretical value (49). In general the values for various steels show
considerable increase with temperature. There are indeed more
notable exceptions to the law. The constant for rhodium, for example,
is only 1.33. In general, however, the law may be taken to be well
established.
It should be pointed out that though the classical statistical
theory of Lorentz also leads to a law of the form (48), the value of the
constant comes out to be 2k²/e² in place of (π²/3)(k²/e²). This is in
definite disagreement with the experimental values for most metals
measured and constitutes a serious objection to the Lorentz theory.
The reader may show that, as might be expected, the Lorentz result
can be obtained from the quantum statistical theory above outlined
if one assumes a non-degenerate gas. This makes clear the essential
failure of the Lorentz theory from the standpoint of quantum statistics.

At the same time it must be emphasized that the Sommerfeld
application of the Fermi-Dirac statistics is by no means a complete
picture. It dodges entirely the fundamental question of the mean free
path of the electrons and treats the interaction between the electrons
and the atoms of the metal in a purely superficial fashion. Only a
thoroughgoing application of quantum mechanics can overcome these
defects. As a first approximation, at any rate, the Sommerfeld theory
must be viewed as a successful application of quantum statistics.

2. HEAT PRODUCTION IN A CONDUCTOR. THE THOMSON EFFECT 

The work done per second by the electric field ℰ when an electric
current I flows through the length dx of a conductor is ℰI dx = ℰJS dx,
where S is the area of cross-section. The power expended per unit
volume is therefore Jℰ. Denoting, as usual, by C the rate of flow of
heat per second per unit area, the rate of increase in heat content per
unit volume is −∂C/∂x. Hence the net rate of heat production per unit
volume is

$$Q = J\mathcal{E} - \frac{\partial C}{\partial x}. \tag{50}$$

From eq. (30) we can solve for ℰ in terms of J, etc. Thus substituting
for A₁ in terms of the electrical conductivity σ from eq. (33), we have

$$\mathcal{E} = \frac{J}{\sigma} + \frac{k}{e}\frac{\partial T}{\partial x}\frac{A_2}{A_1} + \frac{kT}{e}\frac{\partial \gamma_1}{\partial x}. \tag{51}$$

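The next step is to substitute this expression for ℰ in that for C
(eq. 31). The result is

$$C = \frac{4\pi kT}{3m}A_2\left[\frac{eJ}{\sigma} - k\frac{\partial T}{\partial x}\left(\frac{A_3}{A_2} - \frac{A_2}{A_1}\right)\right]. \tag{52}$$

From eq. (42) we can solve for (A₃/A₂ − A₂/A₁) in terms of K and get

$$\frac{A_3}{A_2} - \frac{A_2}{A_1} = \frac{3mK}{4\pi k^2 T A_2}, \tag{53}$$

whence eq. (52) can be written

$$C = \frac{kT}{e}\frac{A_2}{A_1}J - K\frac{\partial T}{\partial x}. \tag{54}$$

The quantity Q therefore becomes

$$Q = J\mathcal{E} - \frac{k}{e}J\frac{\partial}{\partial x}\left(\frac{TA_2}{A_1}\right) + \frac{\partial}{\partial x}\left(K\frac{\partial T}{\partial x}\right), \tag{55}$$

or, using (51) again,

$$Q = \frac{J^2}{\sigma} + \frac{\partial}{\partial x}\left(K\frac{\partial T}{\partial x}\right) + \frac{JkT}{e}\frac{\partial}{\partial x}\left(\gamma_1 - \frac{A_2}{A_1}\right). \tag{56}$$

This is a very interesting expression. We recognize at once the term
J²/σ as the Joule heat production rate per unit volume. The second
term, ∂/∂x (K ∂T/∂x), is the rate of heat production per unit volume due to
heat conduction. It is clear that we must attribute the last term in
(56) to the thermoelectric effects. Let us review these briefly.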

We consider first the Thomson effect. If a current flows in the 
conductor in the direction of the positive x axis and if a temperature 
gradient dT/dx is maintained in the conductor, it is found that for 
certain metals in the part of the conductor in which the temperature 
gradient is positive, heat is absorbed by the metal, while in the part 
where the temperature gradient is negative, heat is evolved by the 
metal. This is true, for example, in the case of copper, which is said 
to exhibit a positive Thomson effect. On the other hand, there are 
metals in which positive dT/dx is accompanied by the evolution of 
heat and negative dT/dx by absorption of heat. Iron is an example of 
this, and the effect is known as the negative Thomson effect. In lead 
the effect is so small that it is generally considered negligible. The 
Thomson heat, unlike the Joule heat, is reversible. 

Lord Kelvin (for whom the effect is named) discovered by experiment
that the thermoelectric heat energy evolved or absorbed per
second per unit volume of the conductor, i.e., the heat additional to the
Joule heat, is directly proportional to the product of the current
density and the temperature gradient. The coefficient will be denoted
by μ and called the Thomson thermoelectric coefficient. Thus if we denote
the Thomson heat by Q_T, we have 1

$$Q_T = \mu J\frac{\partial T}{\partial x}. \tag{57}$$

From eq. (56) it follows that for a homogeneous metal

$$\mu = -\frac{kT}{e}\frac{\partial}{\partial T}\left(\gamma_1 - \frac{A_2}{A_1}\right). \tag{58}$$

Substituting the value for A₂/A₁ given just after eq. (46), we obtain

$$\mu = \frac{\pi^2 kT}{3e}\frac{\partial}{\partial T}\left(\frac{1}{\gamma_1}\right). \tag{59}$$

We must now insert the value of γ₁ for a degenerate electron gas
given in eq. (128) of Chapter VIII (the first approximation being
sufficient). We replace τ by 2τ, as usual. Carrying out the
differentiation yields

$$\mu = \frac{2\pi^2 m k^2 T}{3eh^2}\left(\frac{8\pi\tau}{3N}\right)^{2/3}. \tag{60}$$

The slight variation of τ/N with temperature is here neglected. We
now discuss the comparison with experiment. If we substitute into
(60) the value of N/τ = 5.9 × 10²² appropriate for silver and take
T = 300° K, etc., we get approximately μ = 1.5 microvolts/° C
(equivalent to 150 ergs/emu degree C). The experimental value of the
Thomson coefficient for silver is 1.2 microvolts/° C, and in general the
experimental values run between 1 and 10 microvolts/° C. Hence
there is general order of magnitude agreement with experiment. This
is interesting since the original Lorentz theory, which is equivalent to
the present theory for the non-degenerate or very weakly degenerate
case, gives much higher values, indeed of the order of 100
microvolts/° C. In fact, the reader can show by the use of the
non-degenerate forms of γ₁ and A₂/A₁ in (58) that the magnitude of μ on the
classical Lorentz theory is the universal constant (3/2)(k/e), which should
then apply to all conductors. There is one "fly in the ointment," to be
sure, namely, the fact that the Sommerfeld theory in the simple form
here presented is no more able than the Lorentz theory to distinguish
between the positive and negative Thomson effects, i.e., we can
calculate |μ| only.

1 There are many so-called Thomson coefficients, only one of which is presented
here. It is the one which is ordinarily measured. For a fuller discussion, see
Sommerfeld and Frank, "Statistical Theory of Thermoelectric, Galvano- and
Thermomagnetic Phenomena in Metals," Reviews of Modern Physics 3, 1 (1931). A thorough
discussion of the thermoelectric effects from the standpoint of thermodynamics will
be found in P. W. Bridgman, "The Thermodynamics of Electrical Phenomena in
Metals," Macmillan, 1934.
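The numerical comparison for silver can be reproduced in a few lines. The sketch below assumes the degenerate first approximation γ₁ = E₀/kT with E₀ = (h²/2m)(3N/8πτ)^{2/3}, so that the magnitude of the Thomson coefficient is (π²/3)(k/e)(kT/E₀), and works in SI units with present-day constants; the function names are illustrative.

```python
import math

H_PLANCK = 6.62607015e-34     # J s
M_ELECTRON = 9.1093837e-31    # kg
K_BOLTZMANN = 1.380649e-23    # J per degree
E_CHARGE = 1.602176634e-19    # C

def fermi_energy(n):
    """E0 = (h^2 / 2m) (3n / 8 pi)^(2/3) in joules, n in electrons per m^3."""
    return H_PLANCK**2 / (2.0 * M_ELECTRON) * (3.0 * n / (8.0 * math.pi)) ** (2.0 / 3.0)

def thomson_sommerfeld(n, temperature):
    """Magnitude of mu in volts per degree: (pi^2 / 3) (k/e) (kT / E0)."""
    return (math.pi**2 / 3.0) * (K_BOLTZMANN / E_CHARGE) * (
        K_BOLTZMANN * temperature / fermi_energy(n)
    )

def thomson_lorentz():
    """Classical magnitude of mu in volts per degree: (3/2) (k/e)."""
    return 1.5 * K_BOLTZMANN / E_CHARGE

# Silver, with N/tau = 5.9e22 per cm^3 = 5.9e28 per m^3 as quoted in the text.
mu_silver = thomson_sommerfeld(5.9e28, 300.0)
print(mu_silver * 1e6, thomson_lorentz() * 1e6)  # both in microvolts per degree
```

The degenerate value is of the order of one microvolt per degree, in line with the figure quoted above, while the classical value is roughly a hundred times larger.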

3. THE PELTIER EFFECT 

We next consider a non-uniform conductor maintained at uniform 
temperature. In eq. (56) the term in dT/dx accordingly drops out and 
the heat production due to the flow of current through the non- 
homogeneous medium is given by the last term with the understanding 
that T remains constant while the medium changes its character with 
x. Let us suppose we have two different metals in "contact/ 1 which is 
taken to mean that they are separated by a very narrow transition 
layer. We shall define the heat produced per unit area of this layer 
per second by unit current density as the Peltier heat. It is denoted 
by 7ri2, the subscripts referring to the two metallic media. Therefore 
we have 

$$\pi_{12} = \frac{k}{e}\int_{x_1}^{x_2} T\,\frac{d}{dx}\left[\frac{A_2}{A_1} - \gamma_1\right]dx = \frac{kT}{e}\left[\frac{A_2}{A_1} - \gamma_1\right]_{x_1}^{x_2}, \qquad (61)$$

where $x_1$ and $x_2$ refer to the boundaries of the transition layer in the
two metals respectively. Utilizing the expression for the bracket
already employed in eq. (59) we obtain

$$\pi_{12} = \frac{\pi^2 kT}{3e}\left[\frac{1}{\gamma_1}\right]_{x_1}^{x_2} = \frac{2\pi^2 m k^2 T^2}{3eh^2}\left(\frac{8\pi}{3}\right)^{2/3}\left[\frac{1}{(N/\tau)_2^{2/3}} - \frac{1}{(N/\tau)_1^{2/3}}\right], \qquad (62)$$

where $(N/\tau)_1$ and $(N/\tau)_2$ refer to the values for the first and second
metals respectively. The calculation for the case of the copper-silver
couple at $T = 300$ K yields approximately 100 microvolts. The
agreement with experiment is not startlingly good, since the experimental
value turns out to be about $-30$ microvolts, the minus sign
signifying the evolution of heat as the current goes from Cu to Ag.
Fortunately for the Sommerfeld theory, the result calculated on the
old Lorentz theory is even further out of the way, being around 9,000
microvolts. It is perhaps well to emphasize that the experimental
values are not too certain, rendering the comparison between theory
and experiment rather precarious.
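The order of magnitude of the two theoretical figures quoted for the copper-silver couple can be checked with a short calculation. The following is a minimal sketch, not the author's own computation: it assumes one free electron per atom, takes the electron densities (8.47 × 10^22 and 5.86 × 10^22 per cm^3) from modern handbook values rather than from the text, uses the first degenerate approximation $\gamma_1 = \phi/kT$ with $\phi = (h^2/8m)(3N/\pi\tau)^{2/3}$, and compares $(\pi^2 kT/3e)\,[1/\gamma_1]$ taken between the two metals with the classical Lorentz value $(kT/e)\log[(N/\tau)_1/(N/\tau)_2]$. The function names are hypothetical, and the overall sign convention is an assumption.

```python
import math

K_EV = 8.617e-5        # Boltzmann constant in eV/K (k/e in volts per kelvin)
H = 6.626e-27          # Planck constant, erg s
M_E = 9.109e-28        # electron mass, g
ERG_PER_EV = 1.602e-12

# Assumed free-electron densities in cm^-3 (handbook values, not from the text).
N_CU = 8.47e22
N_AG = 5.86e22


def fermi_energy_ev(n_per_cm3):
    """phi = (h^2 / 8m) (3 n / pi)^(2/3), converted from erg to eV."""
    phi_erg = H**2 / (8.0 * M_E) * (3.0 * n_per_cm3 / math.pi) ** (2.0 / 3.0)
    return phi_erg / ERG_PER_EV


def peltier_sommerfeld(n1, n2, temp_k):
    """(pi^2 k T / 3e) (1/gamma_2 - 1/gamma_1) in volts, with gamma = phi / kT."""
    kt = K_EV * temp_k
    return math.pi**2 / 3.0 * kt * (kt / fermi_energy_ev(n2) - kt / fermi_energy_ev(n1))


def peltier_lorentz(n1, n2, temp_k):
    """Classical value (kT/e) log(n1/n2) in volts."""
    return K_EV * temp_k * math.log(n1 / n2)


pi_quantum = peltier_sommerfeld(N_CU, N_AG, 300.0)
pi_classical = peltier_lorentz(N_CU, N_AG, 300.0)
print(f"Sommerfeld: {pi_quantum * 1e6:.0f} microvolts")
print(f"Lorentz:    {pi_classical * 1e6:.0f} microvolts")
```

With these assumed densities the degenerate formula gives a value of the order of 100 microvolts and the classical one a value of the order of 9,000 microvolts, in line with the figures quoted in the text.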
4. THERMOELECTRIC ELECTROMOTIVE FORCE.
THE SEEBECK EFFECT 

We next attempt the calculation of the actual thermoelectric emf 
in an open circuit. 

Consider two different metals 1 and 2 (Fig. 10-2) with junctions
$P_1$ and $P_2$. We wish to find the potential difference between A and B,
maintained at temperature $T$, while $P_1$ and $P_2$ are kept at temperatures
$T'$ and $T''$ respectively.

FIG. 10-2. [Open circuit A, metal 1, $P_1$, metal 2, $P_2$, metal 1, B; the temperatures at A, $P_1$, $P_2$, B are $T$, $T'$, $T''$, $T$.]

Since no current flows, eq. (40) holds for the
field intensity and therefore there exists a difference in potential
between A and B which has the value



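$$V_{AB} = \int_A^B \mathcal{E}\,dx = \frac{k}{e}\int_A^B \frac{dT}{dx}\,\frac{A_2}{A_1}\,dx + \frac{k}{e}\int_A^B T\,\frac{d\gamma_1}{dx}\,dx. \qquad (63)$$

We must evaluate this for the degenerate case and hence use
$A_2/A_1 = \gamma_1 + \pi^2/3\gamma_1$. This yields

$$V_{AB} = \frac{k}{e}\int_A^B \left(\gamma_1 + \frac{\pi^2}{3\gamma_1}\right)dT + \frac{k}{e}\int_A^B T\,\frac{d\gamma_1}{dx}\,dx. \qquad (64)$$

Integration by parts of the second integral yields $-\int_A^B (k/e)\,\gamma_1\,dT$
(the integrated part vanishes because A and B lie in the same metal at the
same temperature), whence

$$V_{AB} = \frac{\pi^2 k}{3e}\int_A^B \frac{dT}{\gamma_1} = \frac{\pi^2 k}{3e}\left[\int_T^{T'}\frac{dT}{\gamma_{11}} + \int_{T'}^{T''}\frac{dT}{\gamma_{12}} + \int_{T''}^{T}\frac{dT}{\gamma_{11}}\right], \qquad (65)$$

where $\gamma_{11}$ is the value of $\gamma_1$ for the first metal and $\gamma_{12}$ that for the
second. The use of the first approximation for $\gamma_1$ (eq. 128 of Chapter
VIII) leads to

$$V_{AB} = \frac{\pi^2 k}{3e}\int_{T'}^{T''}\left(\frac{1}{\gamma_{12}} - \frac{1}{\gamma_{11}}\right)dT = \frac{\pi^2 k}{3e}\,\frac{2mk}{h^2}\left(\frac{8\pi}{3}\right)^{2/3}\int_{T'}^{T''}\left[\left(\frac{\tau}{N}\right)_2^{2/3} - \left(\frac{\tau}{N}\right)_1^{2/3}\right]T\,dT. \qquad (66)$$

Furthermore if we neglect to a first approximation the dependence of
$\tau/N$ on $T$, the approximate result is

$$V_{AB} = \frac{\pi^2 m k^2}{3eh^2}\left(\frac{8\pi}{3}\right)^{2/3}\left[\left(\frac{\tau}{N}\right)_2^{2/3} - \left(\frac{\tau}{N}\right)_1^{2/3}\right]\left(T''^2 - T'^2\right). \qquad (67)$$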
This agrees much better with experimental results than the corresponding
formula for the classical Lorentz case which the reader may show is
in the form

$$V_{AB}^{\text{classical}} = \frac{k}{e}\left[\log\left(\frac{N}{\tau}\right)_1 - \log\left(\frac{N}{\tau}\right)_2\right](T'' - T'). \qquad (68)$$

Thus for the case of a silver-sodium thermocouple substitution into
(67) gives for $T'' - T' = 1$ degree C at 0° C

$$V_{AB} \approx 1.0 \text{ microvolt},$$

whereas the classical formula yields for the same couple under the same
conditions 72 microvolts. The experimental value is approximately 3
microvolts. The reader should work out other cases for himself.
It may indeed be shown that (67) can be put into the form:

$$V_{AB} = 546K(t'' - t')\left[1 + \frac{t' + t''}{546}\right], \qquad (69)$$

where $t$ represents centigrade temperature and $K$ is a constant depending
on the couple. This is in line with the actual form of the experimental
observations, though the numerical values do not agree in
every instance.
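The silver-sodium figures and the passage from (67) to (69) can be checked numerically. The sketch below is illustrative only: the electron densities (5.86 × 10^22 per cm^3 for silver, 2.54 × 10^22 per cm^3 for sodium, one free electron per atom) are assumed handbook values, (67) is written in the equivalent form $(\pi^2 k^2/6e)(1/\phi_2 - 1/\phi_1)(T''^2 - T'^2)$ with $\phi = (h^2/8m)(3N/\pi\tau)^{2/3}$, the classical formula is taken as $(k/e)\log[(N/\tau)_1/(N/\tau)_2](T'' - T')$, and $T = t + 273$ as in (69). All function names are hypothetical.

```python
import math

K_EV = 8.617e-5        # Boltzmann constant in eV/K (k/e in volts per kelvin)
H = 6.626e-27          # Planck constant, erg s
M_E = 9.109e-28        # electron mass, g
ERG_PER_EV = 1.602e-12

# Assumed free-electron densities in cm^-3 (not taken from the text).
N_AG = 5.86e22
N_NA = 2.54e22


def fermi_energy_ev(n_per_cm3):
    """phi = (h^2 / 8m) (3 n / pi)^(2/3) in eV."""
    phi_erg = H**2 / (8.0 * M_E) * (3.0 * n_per_cm3 / math.pi) ** (2.0 / 3.0)
    return phi_erg / ERG_PER_EV


def couple_constant(n1, n2):
    """Constant K of eq. (69): V = K (T''^2 - T'^2), in volts per kelvin squared."""
    return math.pi**2 / 6.0 * K_EV**2 * (1.0 / fermi_energy_ev(n2) - 1.0 / fermi_energy_ev(n1))


def emf_sommerfeld(n1, n2, t_cold, t_hot):
    """Eq. (67), absolute temperatures in kelvin, result in volts."""
    return couple_constant(n1, n2) * (t_hot**2 - t_cold**2)


def emf_lorentz(n1, n2, t_cold, t_hot):
    """Classical Lorentz value, result in volts."""
    return K_EV * math.log(n1 / n2) * (t_hot - t_cold)


def emf_centigrade(n1, n2, t_cold_c, t_hot_c):
    """Eq. (69), centigrade temperatures, with T = t + 273."""
    return 546.0 * couple_constant(n1, n2) * (t_hot_c - t_cold_c) * (
        1.0 + (t_cold_c + t_hot_c) / 546.0
    )


v_quantum = emf_sommerfeld(N_AG, N_NA, 273.0, 274.0)
v_classical = emf_lorentz(N_AG, N_NA, 273.0, 274.0)
v_centigrade = emf_centigrade(N_AG, N_NA, 0.0, 1.0)
print(f"Sommerfeld: {v_quantum * 1e6:.2f} microvolts per degree")
print(f"Lorentz:    {v_classical * 1e6:.0f} microvolts per degree")
```

The identity behind (69) is simply $T''^2 - T'^2 = (t'' - t')(t' + t'' + 546)$, which the last function reproduces.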

5. TRANSVERSE GALVANOMAGNETIC AND THERMOMAGNETIC
EFFECTS

Less well known than the thermoelectric effects just discussed are 
those produced in a conductor through which heat or electricity is 
flowing by the presence of a magnetic field. The phenomena par- 
ticularly studied are those in which the magnetic field is applied at 
right angles to the current or heat flow. Chief among them is the Hall 
effect in which the magnetic field applied transversely to a current- 
carrying rod produces a transverse potential difference at right angles 
to both field and current which is directly proportional to the product 
of the intensity of the magnetic field and the current strength and 
inversely proportional to the transverse width of the conductor. The 
proportionality constant is the so-called Hall coefficient. There are 
three other similar effects : (a) Ettingshausen effect, or the production 
of a transverse temperature difference by the interaction of transverse 
magnetic field and electric current flow, when the conductor is ther- 
mally insulated ; (b) Nernst effect, or the production of a transverse 
potential difference when heat flow takes place lengthwise of the
conductor; and (c) Righi-Leduc effect, or the production of a transverse
temperature difference when heat flows lengthwise of a thermally 
insulated conductor. 

All these effects constitute transport phenomena which can be 
treated by the theory of this chapter. The fundamental equation (10) 
must be generalized to take care of an additional dimension as well 
as the force on the electrons due to the magnetic field. The resulting 
transport formulas for $J$ and $C$ are analogous to (30) and (31) respectively
but naturally rather more complicated. We shall not present
the analysis or the results here⁵ but merely remark that the agreement
with experiment is reasonably good and of the same general character 
as that already encountered in the thermoelectric phenomena. One 
interesting byproduct of these studies is the change in effective resist- 
ance of a conductor due to the applied magnetic field. 



6. MOTION OF ELECTRONS IN A CRYSTAL LATTICE. 
SIMPLE TREATMENT 

The Sommerfeld theory employed in the preceding sections assumes 
that the conduction electrons are free or at any rate move in a field of 
constant potential. It takes account of the interaction of the moving 
electrons and the atoms or ions of the metal only from the classical 
and formal standpoint of elastic collisions. Actually the potential field 
in which metallic electrons move is by no means uniform and a more 
thoroughgoing application of statistics to the electrical and thermal 
properties of metals must take account of this fact. Much recent work 
has been done in this field⁶ but a thorough survey of it lies outside the
scope of the present volume. However, a simplified special case may 
prove of some interest as an introduction to the modern theory of 
metals. 

In order to employ a statistical distribution formula like eq. (47) of 
Chapter VIII it is essential to know the possible energy values $E_j$ and
the number of such values in any given energy interval. In our previ- 
ous discussion we have used the values appropriate to a set of free 
electrons. We now wish to see how they are modified by the motion of 
the electrons in a force field. The actual field of force encountered in a 
crystal lattice is a complicated affair. Nevertheless if the metal is in the
form of a single crystal, the regular arrangement of the atoms assures
that the field will be of periodic character in space. We therefore
examine the motion of electrons through a periodic potential field.
The simplest case of this kind is the one-dimensional lattice shown
schematically in the following diagram (Fig. 10-3) which plots the
variation of the potential energy $V(x)$ with $x$. In the hollows of the
square saw-toothed curve the potential energy is assumed to have the
constant value $V_0$ and in the crests the constant value $V_1$. For
simplicity we suppose the curve extends from $-\infty$ to $+\infty$. The
widths of the potential energy regions are $2l_0$ and $2l_1$ respectively.
This is, of course, a highly idealized picture but it does provide a simple
type of periodically varying potential energy.

FIG. 10-3.

5 A discussion will be found in the article of Sommerfeld and Frank already cited.

6 Cf. Mott and Jones, "The Theory of the Properties of Metals and Alloys,"
Oxford, 1936. Also cf. A. H. Wilson, "The Theory of Metals," Cambridge, 1936,
and F. Seitz, "The Modern Theory of Solids," McGraw-Hill, 1940.

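The Schrödinger equations (cf. eqs. 3 and 52 of Chapter VIII) for
the two regions where the potential energies are respectively $V_0$ and $V_1$
are

$$\frac{d^2\psi}{dx^2} + \frac{8\pi^2 m}{h^2}(E - V_0)\,\psi = 0 \quad \text{(hollow)} \qquad (70)$$

$$\frac{d^2\psi}{dx^2} + \frac{8\pi^2 m}{h^2}(E - V_1)\,\psi = 0 \quad \text{(crest)}. \qquad (71)$$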



If $E < V_0$ as well as $E < V_1$, it is clear that the solutions of both equations
will be exponential functions with real exponents. Such functions,
however, cannot correspond to genuine transmission of electrons
through the structure, for the terms in which the exponents are negative
will more or less quickly go to zero as $x$ increases and hence correspond
to very small probability of finding electrons very far to the
right of any chosen origin, while the terms in which the exponents are
positive will go to infinity as $x$ increases, which has no meaning in
quantum mechanics. We must therefore choose $E > V_0$ at least,
though it will develop that we can secure transmission under certain
conditions if $E < V_1$. In fact the latter situation is more interesting
than $E > V_1$. Let us therefore discuss the problem for $V_1 > E > V_0$.
Let

$$\frac{8\pi^2 m}{h^2}(E - V_0) = k_0^2, \qquad \frac{8\pi^2 m}{h^2}(V_1 - E) = k_1^2. \qquad (72)$$

The solutions of (70) and (71) become respectively (cf. eqs. 56 of
Chapter VIII)

$$\psi = A_0 e^{ik_0x} + B_0 e^{-ik_0x} \quad \text{(hollow)}, \qquad (73)$$

$$\psi = A_1 e^{k_1x} + B_1 e^{-k_1x} \quad \text{(crest)}, \qquad (74)$$

where the arbitrary constants $A_0$, $B_0$, $A_1$, $B_1$ are in general complex.
The transmission of electrons through the lattice is handled by setting
up the boundary conditions expressing continuity in $\psi$ and $d\psi/dx$ at
each boundary surface. Differentiating and using the prime notation
for differentiation with respect to $x$ we get

$$\psi' = ik_0\left(A_0 e^{ik_0x} - B_0 e^{-ik_0x}\right) \quad \text{(hollow)}, \qquad (75)$$

$$\psi' = k_1\left(A_1 e^{k_1x} - B_1 e^{-k_1x}\right) \quad \text{(crest)}. \qquad (76)$$



We shall now denote the midpoints of successive hollows by 1, 3, 5, ...
and proceed to express the $\psi$ and $\psi'$ values at any point in a hollow in
terms of their values at the midpoints denoted by the subscripts
1, 3, 5, .... Thus we get rid of the arbitrary constants by setting
(with $x$ measured from the midpoint 1)

$$\psi_1 = A_0 + B_0, \qquad \psi_1' = ik_0(A_0 - B_0),$$

whence $\psi$ and $\psi'$ for any point between 1 and the next boundary surface
become

$$\psi = \psi_1\cos k_0x + \frac{\psi_1'}{k_0}\sin k_0x, \qquad \psi' = \psi_1'\cos k_0x - k_0\psi_1\sin k_0x \quad \text{(hollow)}. \qquad (77)$$

We now denote quantities at the first boundary at the left by the
subscript combination 12 (see Fig. 10-3) and at the right by the combination
21. Therefore there results

$$\psi_{12} = \psi_1\cos k_0l_0 + \frac{\psi_1'}{k_0}\sin k_0l_0, \qquad \psi_{12}' = \psi_1'\cos k_0l_0 - k_0\psi_1\sin k_0l_0. \qquad (78)$$




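In similar fashion the $\psi$ and $\psi'$ at any point in the first crest region will
be given in terms of $\psi_{21}$ and $\psi_{21}'$ immediately at the right of the boundary
by

$$\psi = \psi_{21}\cosh k_1x + \frac{\psi_{21}'}{k_1}\sinh k_1x, \qquad \psi' = \psi_{21}'\cosh k_1x + k_1\psi_{21}\sinh k_1x \quad \text{(crest)}. \qquad (79)$$

Hence (to consult the diagram once more)

$$\psi_{23} = \psi_{21}\cosh 2k_1l_1 + \frac{\psi_{21}'}{k_1}\sinh 2k_1l_1, \qquad \psi_{23}' = \psi_{21}'\cosh 2k_1l_1 + k_1\psi_{21}\sinh 2k_1l_1. \qquad (80)$$

Finally, we can employ (77) again to write

$$\psi_3 = \psi_{32}\cos k_0l_0 + \frac{\psi_{32}'}{k_0}\sin k_0l_0, \qquad \psi_3' = \psi_{32}'\cos k_0l_0 - k_0\psi_{32}\sin k_0l_0. \qquad (81)$$

The boundary conditions are

$$\psi_{12} = \psi_{21}; \quad \psi_{12}' = \psi_{21}'; \qquad \psi_{23} = \psi_{32}; \quad \psi_{23}' = \psi_{32}'. \qquad (82)$$

Our task now is to use eqs. (78), (80), (81), and (82) to express $\psi_3$ and
$\psi_3'$ in terms of $\psi_1$ and $\psi_1'$. The resulting equations should hold for the
state functions and their gradients at any two successive mid-hollow
points and therefore tell the general story of the electron transmission
through the periodic potential field. If we utilize the boundary conditions,
eqs. (80) and (81) become

$$\psi_{23} = \psi_{12}\cosh 2k_1l_1 + \frac{\psi_{12}'}{k_1}\sinh 2k_1l_1, \qquad \psi_{23}' = \psi_{12}'\cosh 2k_1l_1 + k_1\psi_{12}\sinh 2k_1l_1, \qquad (80')$$

and

$$\psi_3 = \psi_{23}\cos k_0l_0 + \frac{\psi_{23}'}{k_0}\sin k_0l_0, \qquad \psi_3' = \psi_{23}'\cos k_0l_0 - k_0\psi_{23}\sin k_0l_0. \qquad (81')$$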
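Eliminating $\psi_{12}$, $\psi_{12}'$, $\psi_{23}$ and $\psi_{23}'$ from the six equations (78), (80'),
and (81') yields finally

$$\psi_3 = \psi_1\left[\cosh 2k_1l_1\cos 2k_0l_0 + \frac{1}{2}\left(\frac{k_1}{k_0} - \frac{k_0}{k_1}\right)\sinh 2k_1l_1\sin 2k_0l_0\right] + \frac{\psi_1'}{k_0}\left[\cosh 2k_1l_1\sin 2k_0l_0 + \sinh 2k_1l_1\left(\frac{k_1}{k_0}\sin^2 k_0l_0 + \frac{k_0}{k_1}\cos^2 k_0l_0\right)\right] \qquad (83)$$

$$\psi_3' = \psi_1'\left[\cosh 2k_1l_1\cos 2k_0l_0 + \frac{1}{2}\left(\frac{k_1}{k_0} - \frac{k_0}{k_1}\right)\sinh 2k_1l_1\sin 2k_0l_0\right] - k_0\psi_1\left[\cosh 2k_1l_1\sin 2k_0l_0 - \sinh 2k_1l_1\left(\frac{k_1}{k_0}\cos^2 k_0l_0 + \frac{k_0}{k_1}\sin^2 k_0l_0\right)\right]. \qquad (84)$$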

Because the structure we are considering is assumed to extend to
infinity in both directions the relations (83) and (84) will hold for any
two successive mid-hollow points and we can therefore write for any
integral $j$

$$\psi_{j+2} = A\psi_j + \frac{B}{k_0}\psi_j', \qquad (85)$$

$$\psi_{j+2}' = A\psi_j' - k_0C\psi_j, \qquad (86)$$

where the $A$, $B$, $C$ are the bracket expressions in (83) and (84). Examination
discloses that

$$A^2 + BC = 1. \qquad (87)$$

Hence we may introduce the angle $W$ such that

$$A = \cos W = \cosh 2k_1l_1\cos 2k_0l_0 + \frac{1}{2}\left(\frac{k_1}{k_0} - \frac{k_0}{k_1}\right)\sinh 2k_1l_1\sin 2k_0l_0, \qquad (88)$$

$$B = \sqrt{\frac{B}{C}}\,\sin W; \qquad C = \sqrt{\frac{C}{B}}\,\sin W. \qquad (89)$$

Then (85) and (86) can be written

$$\psi_{j+2} = \psi_j\cos W + \frac{1}{k_0}\sqrt{\frac{B}{C}}\,\psi_j'\sin W, \qquad (90)$$

$$\psi_{j+2}' = \psi_j'\cos W - k_0\sqrt{\frac{C}{B}}\,\psi_j\sin W. \qquad (91)$$
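The elimination leading to (83) and (84), and the identity (87), can be verified numerically by treating each region as a 2 × 2 matrix acting on the pair $(\psi, \psi')$. The sketch below is a check under stated assumptions rather than the author's method: a half hollow of length $l_0$ carries the matrix of (77), a crest of width $2l_1$ the matrix of (79), and the product hollow · crest · hollow should reproduce $A$, $B/k_0$ and $-k_0C$. The helper names and the test values of $k_0$, $k_1$, $l_0$, $l_1$ are arbitrary choices made for illustration.

```python
import math


def matmul(p, q):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def hollow(k0, length):
    """Propagation of (psi, psi') across a hollow segment, from eq. (77)."""
    c, s = math.cos(k0 * length), math.sin(k0 * length)
    return [[c, s / k0], [-k0 * s, c]]


def crest(k1, length):
    """Propagation of (psi, psi') across a crest segment, from eq. (79)."""
    c, s = math.cosh(k1 * length), math.sinh(k1 * length)
    return [[c, s / k1], [k1 * s, c]]


def brackets(k0, k1, l0, l1):
    """Closed forms of A, B, C as they appear in eqs. (83) and (84)."""
    ch, sh = math.cosh(2 * k1 * l1), math.sinh(2 * k1 * l1)
    s2, c2 = math.sin(2 * k0 * l0), math.cos(2 * k0 * l0)
    sin_sq, cos_sq = math.sin(k0 * l0) ** 2, math.cos(k0 * l0) ** 2
    a = ch * c2 + 0.5 * (k1 / k0 - k0 / k1) * sh * s2
    b = ch * s2 + sh * ((k1 / k0) * sin_sq + (k0 / k1) * cos_sq)
    c = ch * s2 - sh * ((k1 / k0) * cos_sq + (k0 / k1) * sin_sq)
    return a, b, c


# Arbitrary illustrative parameters (dimensionless units).
K0, K1, L0, L1 = 1.3, 0.7, 0.9, 0.4

# Mid-hollow 1 -> boundary -> across the crest -> boundary -> mid-hollow 3.
transfer = matmul(hollow(K0, L0), matmul(crest(K1, 2 * L1), hollow(K0, L0)))
A, B, C = brackets(K0, K1, L0, L1)
print(transfer, (A, B, C), A * A + B * C)
```

Because each factor has unit determinant, the product does too, which is exactly the statement $A^2 + BC = 1$.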
From the symmetrical nature of the infinite structure we are able to
write for every $j$

$$\psi_{j+2} = e^{i\theta}\psi_j, \qquad (92)$$

where $\theta$ is independent of $j$ and is a function of the energy and physical
parameters of the structure only. Then

$$\psi_{j+2}' = e^{i\theta}\psi_j'. \qquad (92')$$

This means that

$$\psi_{j+2}'/\psi_{j+2} = \psi_j'/\psi_j = Z, \qquad (93)$$

where $Z$ is a constant independent of $j$. If we now divide (91) by (90)
we obtain after a little reduction

$$Z = ik_0\sqrt{\frac{C}{B}}. \qquad (94)$$

The substitution of $\psi_j' = ik_0\sqrt{C/B}\,\psi_j$ in (90) then yields

$$\psi_{j+2} = \psi_j(\cos W + i\sin W) = e^{iW}\psi_j, \qquad (95)$$

which on comparison with (92) shows that

$$\theta = W. \qquad (96)$$
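The "little reduction" between (93) and (94) may be spelled out as follows. This is a sketch that uses only (90), (91) and (93); the choice of the positive root is an assumption that corresponds to one direction of travel.

```latex
\begin{aligned}
Z = \frac{\psi'_{j+2}}{\psi_{j+2}}
  &= \frac{Z\cos W - k_0\sqrt{C/B}\,\sin W}{\cos W + (Z/k_0)\sqrt{B/C}\,\sin W}
  && \text{(divide (91) by (90), then divide through by } \psi_j\text{)} \\
Z\cos W + \frac{Z^2}{k_0}\sqrt{\frac{B}{C}}\,\sin W
  &= Z\cos W - k_0\sqrt{\frac{C}{B}}\,\sin W \\
Z^2 &= -k_0^2\,\frac{C}{B},
  \qquad Z = \pm\, i k_0\sqrt{\frac{C}{B}} .
\end{aligned}
```

Inserting the upper sign in (90) gives $\psi_{j+2} = \psi_j(\cos W + i\sin W)$; the lower sign gives $e^{-iW}$, i.e. the same wave travelling in the opposite direction.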

If now $W$ is a real angle, $e^{iW}$ has the absolute value unity and $\psi_{j+2}$ and
$\psi_j$ differ only in phase but not in magnitude. Here there is complete
transmission of electrons through the periodic structure, since $|\psi_j|^2$
measures the average density of moving electrons at the $j$th mid-hollow
point. On the other hand, if $W$ is a complex angle $\psi_{j+2}$ differs
from $\psi_j$ in magnitude as well as in phase and there is a decrease in
$|\psi|^2$ as one goes from one mid-hollow point to the next. This means
that when $W$ is complex there is no transmission of electrons through
the structure. Consequently we can look on $\cos W$ as a function
characterizing the possibility of transmission. For $\cos W$ is real
(cf. eq. 88) and hence we have transmission when

$$|\cos W| \le 1, \qquad (97)$$

and no transmission for

$$|\cos W| > 1. \qquad (98)$$

If we plot $\cos W$ as a function of the energy $E$ from eq. (88) there will
be certain energy ranges for which (97) will be true and electrons of
these energies will be transmitted through the metal represented by
the periodic model in question; these will be separated by other
energy ranges for which (98) is true and electrons of such energies will
be denied transmission through the metal. In other words we can
look upon the structure with the periodically varying potential field as
a kind of filter for electrons. It is indeed analogous to electric and
elastic wave filters which have been studied extensively.⁷

The energy bands for which $|\cos W| \le 1$ give the allowed energy
values for electrons which can actually move through the lattice, i.e.,
those electrons which are of interest for conduction of heat and electricity.
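The filter picture can be made concrete by scanning $\cos W$ from eq. (88) over the range $V_0 < E < V_1$ and collecting the energies for which $|\cos W| \le 1$. The following is a minimal sketch under stated assumptions: units are chosen so that $8\pi^2 m/h^2 = 1$ (hence $k_0^2 = E - V_0$ and $k_1^2 = V_1 - E$), the parameter values are purely illustrative, and the band edges are located only to the resolution of the energy grid. The function names are hypothetical.

```python
import math


def cos_w(energy, v0, v1, l0, l1):
    """cos W of eq. (88) for v0 < energy < v1, in units with 8 pi^2 m / h^2 = 1."""
    k0 = math.sqrt(energy - v0)
    k1 = math.sqrt(v1 - energy)
    return (math.cosh(2 * k1 * l1) * math.cos(2 * k0 * l0)
            + 0.5 * (k1 / k0 - k0 / k1) * math.sinh(2 * k1 * l1) * math.sin(2 * k0 * l0))


def allowed_bands(v0, v1, l0, l1, steps=20000):
    """Return (low, high) energy intervals in which |cos W| <= 1 on a uniform grid."""
    bands, start, previous = [], None, None
    for i in range(1, steps):          # end points excluded: k0 or k1 would vanish
        energy = v0 + (v1 - v0) * i / steps
        transmits = abs(cos_w(energy, v0, v1, l0, l1)) <= 1.0
        if transmits and start is None:
            start = energy
        if not transmits and start is not None:
            bands.append((start, previous))
            start = None
        previous = energy
    if start is not None:
        bands.append((start, previous))
    return bands


# Illustrative, dimensionless parameters (assumptions, not values from the text).
V0, V1, L0, L1 = 0.0, 50.0, 1.5, 0.05

bands = allowed_bands(V0, V1, L0, L1)
for low, high in bands:
    print(f"allowed band: {low:.3f} to {high:.3f}")
```

The lowest energies just above $V_0$ are always rejected, and the allowed bands are separated by gaps, which is the band-pass behavior described in the text.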

In order to plot these energy bands, let us go back to the expression
for $\cos W$ in eq. (88). When $E = V_0$, $k_0 = 0$ and $k_1 = \sqrt{8\pi^2 m/h^2}\,\sqrt{V_1 - V_0}$.
Let us suppose for simplicity that $l_1 \ll l_0$ and indeed
to such an extent that we can replace $\cosh 2k_1l_1$ by $1 + 2k_1^2l_1^2$ and
$\sinh 2k_1l_1$ by $2k_1l_1$ to a sufficiently close approximation. Then $\cos W$
becomes

$$\cos W = (1 + 2k_1^2l_1^2)\cos 2k_0l_0 + \left(\frac{k_1}{k_0} - \frac{k_0}{k_1}\right)k_1l_1\sin 2k_0l_0. \qquad (99)$$

When $k_0 = 0$, this reduces to

$$\cos W = (1 + 2k_1^2l_1^2) + 2k_1^2l_1l_0, \qquad (100)$$

which is certainly greater than unity. Consequently for electrons with
energy in the neighborhood of $V_0$ there is no transmission through the
structure. On the other hand when the energy has increased sufficiently
so that $2k_0l_0 = \pi/2$, $\cos W$ becomes

$$\cos W = \left(\frac{k_1}{k_0} - \frac{k_0}{k_1}\right)k_1l_1, \qquad (101)$$

which can be smaller than unity in absolute value. Thus, to take an
illustration in which the dimensions are reasonable for a metal crystal
lattice, let $2l_0$ be of the order of $5 \times 10^{-8}$ cm and $2l_1$ of order $10^{-12}$ cm.
Let $V_0$ be of the order of 1 electron volt, while $V_1$ is of order 10 to 100
electron volts. Then $\cos W$ in (101) will be much smaller than unity
and transmission is assured for the energy corresponding to $2k_0l_0 = \pi/2$.
When the energy is great enough so that $2k_0l_0 = \pi$, $\cos W$ again
becomes greater than unity in absolute value, taking the form
$-(1 + 2k_1^2l_1^2)$ as may be seen from (99). The result is that as $E$
increases from $V_0$ to $V_1$, $\cos W$ oscillates in a manner shown graphically
(of course only in qualitative fashion) in Fig. 10-4. It will be seen that
as $E$ increases, $|\cos W|$ exceeds unity at periodic intervals given
approximately by

$$2k_0l_0 = n\pi, \qquad n = 0, 1, 2, \ldots \qquad (102)$$

For small energy ranges in the vicinity of the values corresponding to
(102) there will be no transmission. These regions are indicated by
cross-hatching in the figure. The corresponding energy values are
given by

$$E - V_0 = \frac{n^2h^2}{32ml_0^2}. \qquad (102')$$

7 For elastic wave and acoustic filters with the theory treated in mathematical
analogy with the foregoing, see R. B. Lindsay, "The Filtration of Sound," Parts
I and II, Journal of Applied Physics 9, 612 (1938) and 10, 680 (1939).
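The numbers in the illustration can be checked directly. The sketch below is illustrative only: it takes $2l_0 = 5 \times 10^{-8}$ cm and $2l_1 = 10^{-12}$ cm from the text, assumes $V_1 - V_0 = 10$ electron volts (the text gives only an order of magnitude), evaluates the forbidden energies $E - V_0 = n^2h^2/32ml_0^2$, and evaluates $\cos W$ from (101) at $2k_0l_0 = \pi/2$, which corresponds formally to $n = 1/2$. The function names are hypothetical.

```python
import math

H = 6.626e-27          # Planck constant, erg s
M_E = 9.109e-28        # electron mass, g
ERG_PER_EV = 1.602e-12


def forbidden_energy_ev(n, l0_cm):
    """E - V0 = n^2 h^2 / (32 m l0^2), eq. (102'), in electron volts."""
    return n**2 * H**2 / (32.0 * M_E * l0_cm**2) / ERG_PER_EV


def wave_number(energy_ev):
    """k = sqrt(8 pi^2 m E / h^2) in cm^-1 for an energy difference in eV, eq. (72)."""
    return 2.0 * math.pi / H * math.sqrt(2.0 * M_E * energy_ev * ERG_PER_EV)


def cos_w_quarter(v1_minus_v0_ev, l0_cm, l1_cm):
    """cos W from eq. (101), evaluated where 2 k0 l0 = pi/2."""
    e_minus_v0 = forbidden_energy_ev(0.5, l0_cm)   # 2 k0 l0 = pi/2  <->  n = 1/2
    k0 = wave_number(e_minus_v0)
    k1 = wave_number(v1_minus_v0_ev - e_minus_v0)
    return (k1 / k0 - k0 / k1) * k1 * l1_cm


L0_CM = 2.5e-8         # half of 2 l0 = 5e-8 cm (from the text)
L1_CM = 5.0e-13        # half of 2 l1 = 1e-12 cm (from the text)
BARRIER_EV = 10.0      # assumed value of V1 - V0

first_gap = forbidden_energy_ev(1, L0_CM)
second_gap = forbidden_energy_ev(2, L0_CM)
cos_w_mid = cos_w_quarter(BARRIER_EV, L0_CM, L1_CM)
print(f"first forbidden energy:  {first_gap:.2f} eV above V0")
print(f"second forbidden energy: {second_gap:.2f} eV above V0")
print(f"cos W at 2 k0 l0 = pi/2: {cos_w_mid:.1e}")
```

With these dimensions the first forbidden energy lies about 1.5 electron volts above $V_0$, and $\cos W$ midway to it is far below unity, as the text asserts.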

It will be noted that the forbidden regions grow progressively narrower
in width as the energy increases. This can be seen on examination
of (99). The unshaded regions in Fig. 10-4 correspond to the
transmitted energy bands. The criterion (102) for the non-transmitted
energy values has an interesting significance connected with the
possibility of interpreting $k_0$ in terms of the de Broglie wavelength of
the electron. For since the latter wavelength is $\lambda = h/mv$, we see
from (72) that

$$k_0 = \frac{2\pi}{\lambda}$$

and hence the wavelengths not transmitted are those in the immediate
vicinity of

$$\lambda = \frac{4l_0}{n}.$$

FIG. 10-4.

If we used the parlance of electrical and elastic wave filters we should
call the metal lattice here being considered a high-pass electron filter.
Of course it is a band-pass filter as well.

The question may be raised: What happens when $E > V_1$? The
solutions then become harmonic in both crests and hollows. This
can be readily accommodated to the solution above by merely taking
$k_1$ pure imaginary. Thus we set

$$k_1 = ik_1',$$

where $k_1'$ is real and positive. Then we have

$$\cos W = \cos 2k_1'l_1\cos 2k_0l_0 - \frac{1}{2}\left(\frac{k_1'}{k_0} + \frac{k_0}{k_1'}\right)\sin 2k_1'l_1\sin 2k_0l_0. \qquad (104)$$

If $2k_1'l_1$ is still small, however, we can use the same approximation as in
the previous case with the same general consequences. Of course, as
the energy increases very considerably $2k_1'l_1$ will eventually become
large enough to lead to oscillations in $\cos 2k_1'l_1$ and $\sin 2k_1'l_1$ though at
a slower rate than $\cos 2k_0l_0$. At the same time the ratios $k_1'/k_0$ and
$k_0/k_1'$ will approach unity. The result is that $\cos W$ approaches

$$\cos W = \cos 2(k_1'l_1 + k_0l_0), \qquad (105)$$

with the transmission of almost all energies, the non-transmission
bands being reduced to negligible width compared with the transmission
bands.
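The continuation to $E > V_1$ can be checked by evaluating eq. (88) with a purely imaginary $k_1$ and comparing with eq. (104). The sketch below uses Python's standard `cmath` module; the function names and the numerical test values are arbitrary and serve only as an illustration.

```python
import cmath
import math


def cos_w_general(k0, k1, l0, l1):
    """Eq. (88), evaluated with complex arithmetic so that k1 may be imaginary."""
    return (cmath.cosh(2 * k1 * l1) * math.cos(2 * k0 * l0)
            + 0.5 * (k1 / k0 - k0 / k1) * cmath.sinh(2 * k1 * l1) * math.sin(2 * k0 * l0))


def cos_w_above_crest(k0, k1_prime, l0, l1):
    """Eq. (104), valid for E > V1 with k1 = i k1'."""
    return (math.cos(2 * k1_prime * l1) * math.cos(2 * k0 * l0)
            - 0.5 * (k1_prime / k0 + k0 / k1_prime)
            * math.sin(2 * k1_prime * l1) * math.sin(2 * k0 * l0))


# Arbitrary illustrative values (dimensionless).
K0, K1P, L0, L1 = 1.7, 0.9, 0.8, 0.3

continued = cos_w_general(K0, 1j * K1P, L0, L1)
direct = cos_w_above_crest(K0, K1P, L0, L1)
# Limit of eq. (105): when k1' equals k0 the ratios are exactly unity.
limit = cos_w_above_crest(K0, K0, L0, L1)
print(continued, direct, limit)
```

The imaginary part of the continued expression vanishes to rounding error, and setting $k_1' = k_0$ reproduces $\cos 2(k_1'l_1 + k_0l_0)$ exactly.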

The structure considered in this section is idealized to the extent 
that it extends to infinity in both directions. One might suppose that 
the situation would be different in a finite lattice which corresponds 
more exactly to an actual metal crystal. However the analysis 
indicates that the energy bands for transmission are not seriously 
affected by the change to the finite case. 

In order to apply the ideas of this section to actual metallic lattices 
it is necessary to generalize them to three dimensions. This compli- 
cates the analysis considerably and we shall not embark on it here. 
The results are similar, however, in the sense that there are certain 
energy bands for which electrons are allowed to move freely through a 
metal crystal. This does not mean, of course, that an undisturbed 
metal in equilibrium will always have some current flowing through 
it ; in equilibrium there will always be as many electrons moving on the 
average in any one direction as in the reverse direction. However the 
imposition of an electric field or a temperature gradient will upset the 
equilibrium and produce a net flow in a certain direction. The theory 
can also distinguish between good and bad conductors. In a bad conductor
we have to suppose that all the lower allowed energy bands are
filled with their complete quota of electrons, and that this is true even 
for the valency electrons which are the fastest moving electrons in the 
atoms composing the crystal. In order to produce a current it is neces- 
sary to transfer electrons in lower energy states to higher ones across 
the gap separating the allowed bands. If these gaps are sufficiently 
great it may take very considerable energy to cross them and the sub- 
stance will then act effectively like an insulator. If on the other hand 
the energy bands available for the fastest moving electrons are not 
entirely occupied, electrons can be transferred to higher but neighbor- 
ing energies without going out of the energy band and hence with 
effectively little change in energy and the substance behaves as a 
metallic conductor. 

7. APPLICATION OF STATISTICS TO MAGNETISM 

It is well known that an external magnetic field of intensity $\mathcal{H}$
induces in a metal a magnetic moment per unit volume, i.e., an intensity
of magnetization $\mathcal{M}$, which for weak fields is directly proportional
to the field intensity. We can write

$$\mathcal{M} = \chi\mathcal{H}, \qquad (106)$$

where $\chi$ is known as the magnetic susceptibility. If $\chi > 0$ and is a
constant independent of the field for moderate fields the metal is
termed paramagnetic, while if $\chi < 0$, it is called diamagnetic. Metals
possessing a finite value of $\mathcal{M}$ in the absence of a field and a susceptibility
which varies with the field intensity are called ferromagnetic. We
shall not be concerned with this type here and in fact shall concentrate
our attention almost exclusively on paramagnetism.

When a magnetic field of intensity $\mathcal{H}$ changes the intensity of
magnetization of a metal by $d\mathcal{M}$, the amount of energy required per
unit volume is the scalar product

$$dU = -\mathcal{H}\cdot d\mathcal{M}. \qquad (107)$$

When the field changes from 0 to $\mathcal{H}$ the change in energy density is
then (assuming (106))

$$U - U_0 = -\tfrac{1}{2}\chi\mathcal{H}^2. \qquad (108)$$

This represents a gain or loss in energy of the metallic electrons which
are responsible for the magnetism. For paramagnetic substances, there
is a loss in energy in the field, the electrons tending to line up their
magnetic moments parallel to and in the direction of the field. Diamagnetism
is associated with the effect of a magnetic field on electrons
moving in closed orbits in which the magnetic moments induced by the
field are antiparallel to the field.

In quantum mechanics⁸ it is shown that the magnetic moment of
the spinning electron is

$$\mu = \frac{eh}{4\pi mc}, \qquad (109)$$

where $e$ is the charge on the electron in esu and $c$ is the velocity of light
in cm/sec. Strictly speaking to get the total magnetic moment of an
electron in an atom one must add to (109) the magnetic moment due
to the orbital motion of the electron. The valence electrons of metallic
atoms, however, may be considered free to a first approximation and
hence their whole magnetic moment may be taken to be that due to
spin. We shall confine our attention to these free electrons.

If an external magnetic field is imposed on a metal, it is assumed
that the free electron spins are oriented either parallel or antiparallel
to the field. Hence if $E$ is the energy of an electron in the absence of the
field, then its energy will be $E - \mu\mathcal{H}$ if it lines up parallel to the field
and $E + \mu\mathcal{H}$ if it lines up antiparallel to the field.

The number of electrons in the energy range $(E, E + dE)$ in the
absence of the field is taken to be the usual Fermi-Dirac distribution,
viz.,⁹

$$N(E)\,dE = \frac{4\pi\tau(2m)^{3/2}}{h^3}\,\frac{E^{1/2}\,dE}{e^{-\gamma_1 + E/kT} + 1}. \qquad (110)$$

This follows from eq. (66) of Chapter VIII with $2\tau$ in place of $\tau$ and
a = 1. The notation otherwise remains the same as in Chapter VIII.
We shall find it simpler to write

$$N(E)\,dE = D(E)f(E)\,dE, \qquad (111)$$

with

$$D(E) = \frac{4\pi\tau(2m)^{3/2}}{h^3}\,E^{1/2}, \qquad f(E) = \frac{1}{e^{-\gamma_1 + E/kT} + 1}. \qquad (112)$$

On the average in the absence of a magnetic field half of these electrons
have one type of spin and half the other. Hence when the field is
applied the number of electrons with parallel spin is

$$N_p\,dE = \tfrac{1}{2}D(E)f(E - \mu\mathcal{H})\,dE, \qquad (113)$$

while the number with antiparallel spin is

$$N_a\,dE = \tfrac{1}{2}D(E)f(E + \mu\mathcal{H})\,dE. \qquad (114)$$

In these formulas the $D(E)$ is left unaltered, since $D(E)\,dE$ is the number
of energy values lying between $E$ and $E + dE$ and is unaffected
by the field.

8 Cf., for example, Lindsay and Margenau, "Foundations of Physics," p. 480.

9 The treatment here follows in the main Fröhlich, "Elektronentheorie der
Metalle," pp. 145 ff., Berlin, 1936.

Now at $T = 0$, all states with energy less than the maximum value
$\phi$, which is equal to $(h^2/8m)(3N/\pi\tau)^{2/3}$ (eq. 135 of Chapter VIII with $2\tau$
replacing $\tau$ as usual for electrons), are occupied, while all states with
energy greater than $\phi$ are empty. If we agree to denote the maximum
kinetic energy of electrons with spin parallel to the field by $E_p$ and the
maximum kinetic energy of electrons with antiparallel spin by $E_a$ we
have

$$E_p = \phi + \mu\mathcal{H}; \qquad E_a = \phi - \mu\mathcal{H}. \qquad (115)$$

It is clear that there is an excess of electrons with parallel spin. In fact
we can compute the excess as (note that for $T = 0$, $f(E) = 1$ for the
occupied states)

$$x = \frac{1}{2}\int_{\phi - \mu\mathcal{H}}^{\phi + \mu\mathcal{H}} D(E)\,dE. \qquad (116)$$

Since $\mu\mathcal{H} \ll \phi$ for magnetic field intensities less than $0.5 \times 10^6$ gauss
we evaluate the integral to a high degree of approximation as

$$x = \mu\mathcal{H}\,D(\phi). \qquad (117)$$

Associated with the excess electrons is a magnetic moment per unit of
volume of magnitude

$$\mathcal{M} = \frac{\mu^2\mathcal{H}}{\tau}\,D(\phi). \qquad (118)$$

Consequently in so far as the magnetic properties of metals can be
considered as due to free electrons, all metals should be paramagnetic
and possess at $T = 0$ a susceptibility

$$\chi = \mu^2 D(\phi)/\tau. \qquad (119)$$
Actually this is not the case, for many metals are diamagnetic. Nevertheless
there are some metals, e.g., the alkalies, for which the above
assumption appears justified and which permit a comparison with
experiment. The values are not given, of course, at $T = 0$, but an
extension of the analysis given above shows that at low temperature
there is very little variation of $\chi$ with temperature and consequently
(119) should apply fairly accurately even when $T \ne 0$. In utilizing
(112) we have

$$D(\phi) = \frac{4\pi\tau(2m)^{3/2}}{h^3}\,\phi^{1/2}. \qquad (120)$$

The substitution of $\phi = \dfrac{h^2}{2m}\left(\dfrac{3N}{8\pi\tau}\right)^{2/3}$ yields finally for the
susceptibility

$$\chi = \frac{4\pi m\mu^2}{h^2}\left(\frac{3N}{\pi\tau}\right)^{1/3}. \qquad (121)$$

This can be further simplified by the introduction of the value of $\mu$
from (109), whence

$$\chi = 1.9 \times 10^{-6}\left(\frac{\rho n_0}{A}\right)^{1/3}, \qquad (122)$$

where $\rho$ is the density, $A$ the atomic weight and $n_0$ the number of electrons
per atom. The following table gives the comparison between the
experimental values for the alkali metals and those computed from
(122). It should be noted that the values in the table are mass susceptibilities
and are obtained by dividing $\chi$ in (122) by the density.
The values quoted are in multiples of $10^{-6}$.

TABLE OF SUSCEPTIBILITIES OF THE ALKALIES
(Taken from Mott and Jones, "Properties of Metals and Alloys," 1936.)

                                          Li      Na      K       Rb      Cs
  χ (computed)                            1.5     0.68    0.60    0.32    0.24
  χ (observed), Honda and Owen (1912)     0.5     0.51    0.40    0.07   -0.10
  χ (observed), Lane (1930)                       0.65    0.54    0.21    0.22
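The numerical coefficient in (122) and the computed row of the table can be reproduced with a short calculation. The sketch below is illustrative only: it works in Gaussian units, takes the Bohr magneton, Avogadro's number, and the densities and atomic weights of the alkali metals from modern handbook values (none of which are given in the text), assumes $n_0 = 1$, and uses the closed form $\chi = (4\pi m\mu^2/h^2)(3N/\pi\tau)^{1/3}$ with $N/\tau = \rho n_0 N_A/A$. The names are hypothetical.

```python
import math

H = 6.626e-27          # Planck constant, erg s
M_E = 9.109e-28        # electron mass, g
MU_B = 9.274e-21       # Bohr magneton eh/(4 pi m c), erg per gauss
N_AVOGADRO = 6.022e23

# Assumed handbook values: (density in g/cm^3, atomic weight).
ALKALIES = {
    "Li": (0.534, 6.94),
    "Na": (0.971, 22.99),
    "K": (0.862, 39.10),
    "Rb": (1.532, 85.47),
    "Cs": (1.873, 132.91),
}

# Computed row of the table in the text, in multiples of 1e-6.
BOOK_COMPUTED = {"Li": 1.5, "Na": 0.68, "K": 0.60, "Rb": 0.32, "Cs": 0.24}


def coefficient():
    """Numerical factor multiplying (rho n0 / A)^(1/3) in eq. (122)."""
    return (4.0 * math.pi * M_E * MU_B**2 / H**2
            * (3.0 / math.pi) ** (1.0 / 3.0) * N_AVOGADRO ** (1.0 / 3.0))


def mass_susceptibility(density, atomic_weight, electrons_per_atom=1):
    """Volume susceptibility from eq. (122) divided by the density."""
    volume_chi = coefficient() * (density * electrons_per_atom / atomic_weight) ** (1.0 / 3.0)
    return volume_chi / density


computed = {name: mass_susceptibility(rho, a) * 1e6 for name, (rho, a) in ALKALIES.items()}
print(f"coefficient: {coefficient():.2e}")
for name, value in computed.items():
    print(f"{name}: {value:.2f} (book: {BOOK_COMPUTED[name]})")
```

The coefficient comes out close to $1.9 \times 10^{-6}$, and the mass susceptibilities agree with the computed row of the table to within the precision of the assumed densities.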

The calculation of the diamagnetic susceptibility involves a con- 
sideration of the effect of the magnetic field on the orbital motion of the 
atomic electrons. The analysis is beyond the scope of the present 
volume.¹⁰

10 Reference may be made to Mott and Jones, op. cit., pp. 201 ff. 
PROBLEMS

1. Calculate values of the electron mean free path for sodium, copper, silver and
gold from eq. (39) by using experimental values for the electrical conductivity at the
temperatures -200° C, -100° C, 0° C, 100° C, and 200° C.

2. Calculate values of the electron mean free path for the metals listed in Problem 1
from eq. (47) by using experimental values for the thermal conductivity.

3. Derive the expressions for the thermal and electrical conductivities using the
classical Maxwell-Boltzmann distribution function for $f_0$. Show that the Wiedemann-Franz
ratio in this case is $2k^2/e^2$.

4. Derive the expression for the Thomson coefficient $\mu$ on the basis of the classical
Lorentz theory. Evaluate it for the case of silver.

5. Verify eq. (69) and carry through the calculation for the thermoelectric emf
of a copper-iron couple with junction temperatures 0° C and 100° C respectively.

6. Verify eq. (87).

7. Show that if the saw-tooth potential curve in Fig. 10-3 is replaced by one in
which the variation in potential from one region to the next is continuous and very
gradual the corresponding infinite linear lattice passes electrons of all energies.
(Cf. the acoustical analogy in R. B. Lindsay, J. Acous. Soc. Am., 12, 378 [1941].)

8. Prove that the amount of energy per unit volume associated with the change of
the intensity of magnetization of a metal by amount $d\mathcal{M}$ by a field of intensity $\mathcal{H}$ is

$$dU = -\mathcal{H}\cdot d\mathcal{M}.$$



CHAPTER XI 
EMISSION OF ELECTRONS FROM SURFACES 

1. SIMPLE STATISTICAL THEORY OF THERMIONIC EMISSION 

The emission of electrons from hot bodies, the so-called thermionic 
effect, is so well known and its importance in industry so well realized 
that it is unnecessary here to give a detailed description of the experi- 
mental facts. A good review of these will be found in the article by 
S. Dushman in "Reviews of Modern Physics," 2, 381 (1930). The
aim of the present discussion is the statistical derivation of the Richard- 
son equation giving the thermionic current as a function of the temper- 
ature of the emitting metal. The method used is closely allied to that 
employed in Chapter X for the study of electrical conduction in metals 
and it is assumed that the emitted electrons form in the metal a degen- 
erate gas obeying the Fermi-Dirac statistics. 

The emitting metal is assumed to be in a vacuum with the emitting
surface plane and perpendicular to the $x$ axis. We further suppose
that only those electrons are able to leave the metal for which the
velocities in the $x$ direction exceed the critical value $v_{x0}$. This corresponds
to the kinetic energy $\tfrac{1}{2}mv_{x0}^2$, which we shall designate as $E_c$. It
presumably will be a characteristic constant for the metal. Its physical
significance will be discussed later. The thermionic current density
will then be given by an expression like eq. (16) of Chapter X, except
that in the integration only the velocity components in the $y$ and $z$
directions are allowed to run from $-\infty$ to $+\infty$, whereas $v_x$ has the
lower limit $v_{x0}$. Moreover the general distribution function $f$ is
replaced by $f_0$, since no external field is applied. The expression for the
thermionic current density therefore is

$$J = \frac{2m^3e}{h^3}\int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty}\int_{v_{x0}}^{+\infty}\frac{v_x\,dv_x\,dv_y\,dv_z}{e^{-\gamma_1 + E/kT} + 1}. \qquad (1)$$



The evaluation of the integral is materially simplified by the fact that
$E$ in the integrand can no longer take on the value zero as in eq. (16)
in Chapter X. In fact the minimum value of $E$ is $E_c$. It develops that
for temperatures actually employed in thermionic emission
$E_c/kT - \gamma_1 \gg 1$. Consequently the unity in the denominator of the integrand
can safely be neglected compared with $e^{-\gamma_1 + E/kT}$ and $J$ becomes

$$J = \frac{2m^3e}{h^3}\,e^{\gamma_1}\int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty}\int_{v_{x0}}^{+\infty} v_x\,e^{-E/kT}\,dv_x\,dv_y\,dv_z. \qquad (2)$$

The integration over $v_y$ and $v_z$ is immediate from the well-known
$\int_{-\infty}^{+\infty} e^{-au^2}\,du = \sqrt{\pi/a}$. The result is

$$J = \frac{4\pi m^2ekT}{h^3}\,e^{\gamma_1}\int_{v_{x0}}^{+\infty} v_x\,e^{-mv_x^2/2kT}\,dv_x, \qquad (3)$$

or on performing the final integration

$$J = \frac{4\pi mek^2T^2}{h^3}\,e^{\gamma_1 - E_c/kT}. \qquad (4)$$

If we were to use the non-degenerate form for $\gamma_1$ (eq. 68 of Chapter
VIII with $2\tau$ in place of $\tau$ since we are dealing with electrons), the
result would show the thermionic current density as

$$J = \frac{Ne}{\tau}\left(\frac{kT}{2\pi m}\right)^{1/2} e^{-E_c/kT}. \qquad (5)$$

On the other hand the more reasonable assumption of the degenerate
form for $\gamma_1$ (cf. the first approximation in 128 of Chapter VIII with $2\tau$
in place of $\tau$, as usual) yields simply

$$J = \frac{4\pi mek^2}{h^3}\,T^2\,e^{-(E_c - \phi)/kT}, \qquad (6)$$

where for convenience we have set

$$\phi = \frac{h^2}{8m}\left(\frac{3N}{\pi\tau}\right)^{2/3}. \qquad (7)$$

Both eqs. (5-6) were set forth by Richardson,¹ who made the first
elaborate studies of the phenomenon. It proved rather difficult to
decide between the two experimentally, since the temperature dependence
enters much more critically through the exponential term than in
the multiplicative coefficient. Nevertheless (6) is the form which has
come to be considered the better representation of the effect.

We must, however, make clear the significance of the quantity
$\phi$. At first its presence might seem rather anomalous since $E_c$ has
already been defined as the minimum kinetic energy for which electrons
are able to leave the metal. The experimental interpretation
of the exponent of e in the emission formula is that it represents (when
its sign is changed and it is multiplied by kT) the work function or
energy necessary to get an electron out of the metal. Now clearly $E_c$
would actually represent this quantity only if all the emitted electrons
were originally at rest. However it is most natural to assume that the
electrons which get out are those with greatest kinetic energy and the
work to get one of them out will be the difference between $E_c$ and this
maximum kinetic energy. We have already shown (eq. 135 of Chapter
VIII, with $2\tau$ in place of $\tau$) that the maximum kinetic energy in a
degenerate electron gas is

$$\epsilon_{\max} = \frac{h^2}{2m}\left(\frac{3N}{8\pi\tau}\right)^{2/3}. \tag{7'}$$

But the comparison between (7) and (7') shows us that $\phi = \epsilon_{\max}$
and our assumption then appears to be justified. The exponent
$-(E_c - \phi)/kT$ in the emission law is consistent with the hypothesis
that it is the fastest moving electrons which leave at any given temperature.

1 O. W. Richardson, "The Emission of Electricity from Hot Bodies," Longmans
(1921).

In the derivation leading to (6) we have assumed that $(E_c - \phi)/kT$
is large compared with unity. It is now necessary to justify this.
We have just seen that $E_c - \phi$ represents the work necessary to
get one of the fastest moving electrons out of the metal, i.e., it is
the work function. Measurement indicates that in ordinary metals
this is of the order of magnitude of several electron volts, but
$kT = 1.37 \times 10^{-16}\,T$ ergs and, even for T around 1000 K, is a small
fraction of an electron volt. Consequently our assumption is validated.
Since $\tfrac{1}{2}mv_{x0}^2 > \phi$, it follows that $v_{x0} > v_0$ where
$v_0 = (h/m)(3N/8\pi\tau)^{1/3}$ (cf. 139 of Chapter VIII) is the velocity at
which the distribution function $f$ drops to the value $\tfrac{1}{2}$.
Consequently $v_{x0}$ is farther along on the distribution curve (Fig. 8-2)
than $v_0$ and corresponds to a part of the curve where the latter is
approximately exponential. For this reason we see that the distribution
for those electrons which are concerned in thermionic emission is
approximately Maxwellian.
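
The size of the exponent can be checked with a few lines of arithmetic. The sketch below uses the value of k quoted in the text; the erg-per-electron-volt conversion factor and the sample work function of 4.5 electron volts are assumptions introduced only for illustration.

```python
# Rough check that (E_c - phi)/kT is large at thermionic temperatures.
K_ERG_PER_K = 1.37e-16   # Boltzmann constant as quoted in the text, erg/K
ERG_PER_EV = 1.602e-12   # assumed conversion factor, erg per electron volt


def kT_in_eV(T):
    """Thermal energy kT in electron volts at absolute temperature T."""
    return K_ERG_PER_K * T / ERG_PER_EV


def emission_exponent(work_function_eV, T):
    """(E_c - phi)/kT for a work function given in electron volts."""
    return work_function_eV / kT_in_eV(T)


if __name__ == "__main__":
    for T in (1000.0, 2000.0):
        # 4.5 eV is an illustrative work function, not a value from the text
        print(T, kT_in_eV(T), emission_exponent(4.5, T))
```

At 1000 K the thermal energy comes out a little under a tenth of an electron volt, so a work function of a few electron volts makes the exponent several tens.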

When we consider the exact experimental verification of (6) we meet
an interesting situation. Evaluation of the coefficient $4\pi e m k^2/h^3$ in
formula (6) gives 120.4 amperes/cm² (degree C)² approximately.
This is just about double the experimental value for a number of pure
metals, namely Ca, Mo, Ta, Th, and W. This discrepancy can be
accounted for by assuming that at the surface of the metal there exists
a potential barrier at which a certain percentage of electrons will be
reflected even when their kinetic energy is greater than the potential
energy associated with the barrier. Such a situation is not possible in
classical mechanics but proves to be realizable in quantum mechanics.
To calculate the reflection coefficient it is necessary to review the
quantum mechanical theory of the transmission of electrons through a
potential barrier.
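
As a numerical illustration of the coefficient in (6), the following sketch evaluates $4\pi e m k^2/h^3$ and the resulting current density. The constants are present-day SI values (an assumption; the text's 120.4 rests on older constants, and the modern figure is close to 120.2), and the function names are hypothetical.

```python
import math

M_E = 9.1093837e-31      # electron mass, kg (modern value, assumed)
E_CH = 1.602176634e-19   # electron charge, C
K_B = 1.380649e-23       # Boltzmann constant, J/K
H = 6.62607015e-34       # Planck constant, J s


def richardson_constant():
    """Coefficient 4*pi*e*m*k^2/h^3 of eq. (6), in A cm^-2 K^-2."""
    a_si = 4.0 * math.pi * E_CH * M_E * K_B**2 / H**3  # A m^-2 K^-2
    return a_si / 1.0e4


def richardson_current(T, work_function_eV):
    """J = A T^2 exp(-(E_c - phi)/kT) in A/cm^2, with E_c - phi in eV."""
    kT_eV = K_B * T / E_CH
    return richardson_constant() * T**2 * math.exp(-work_function_eV / kT_eV)


if __name__ == "__main__":
    print(richardson_constant())
    print(richardson_current(2000.0, 4.54))  # tungsten-like work function
```

Halving the coefficient, as the experimental values quoted above suggest, simply halves the computed current at every temperature.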

2. TRANSMISSION OF ELECTRONS THROUGH A POTENTIAL BARRIER 
AND APPLICATION TO THERMIONIC EMISSION 

We shall assume as before that the plane emitting surface of the
metal is perpendicular to the x axis. The Schrödinger equation
(cf. 3 and 52 of Chapter VIII) for an electron with three degrees of
freedom moving in a field of force which is a function of x only and is
characterized by a potential energy function V(x) is

$$\nabla^2\Psi + \frac{8\pi^2 m}{h^2}\,\bigl(E - V(x)\bigr)\Psi = 0. \tag{8}$$

We try a solution in the form

$$\Psi(x, y, z) = \psi(x)\,e^{-i(k_y y + k_z z)}, \tag{9}$$

where

$$k_y^2 = \frac{8\pi^2 m E_y}{h^2}, \qquad k_z^2 = \frac{8\pi^2 m E_z}{h^2}. \tag{10}$$

Here we have set

$$E = E_x + E_y + E_z, \tag{11}$$

and think of $E_x$ as the kinetic energy associated with the x component
velocity, etc. If we substitute (9) into (8) the result is

$$\frac{d^2\psi}{dx^2} + \frac{8\pi^2 m}{h^2}\,\bigl(E_x - V(x)\bigr)\psi(x) = 0 \tag{12}$$

for the determination of $\psi(x)$.

The question now arises: How are we to represent the function
V(x)? Since we do not know the precise variation in potential at the
boundary of the metal we make the simplifying assumption that V(x) = 0
inside the metal, while V(x) = $V_0$ everywhere outside the metal. This
corresponds to the discontinuous jump in potential energy schematically
sketched in Fig. 11-1. The region I to the left of the origin corresponds
to the metal and the region II to the right of the origin corresponds to
the vacuum outside the metal. The boundary is chosen at x = 0.

FIG. 11-1.

The Schrödinger equations for regions I and II then become respectively

$$\frac{d^2\psi_1}{dx^2} + \frac{8\pi^2 m}{h^2}\,E_x\,\psi_1 = 0, \quad \text{(I)} \tag{13}$$

$$\frac{d^2\psi_2}{dx^2} + \frac{8\pi^2 m}{h^2}\,(E_x - V_0)\,\psi_2 = 0, \quad \text{(II)} \tag{14}$$

where $E_x$ is of course positive. The general solution of (13) is (cf. eqs.
55 and 56 of Chapter VIII)

$$\psi_1 = A_1 e^{-ik_1x} + B_1 e^{ik_1x}, \tag{15}$$

with $k_1^2 = 8\pi^2 m E_x/h^2$. If $\psi_1$ is assumed to be a harmonic function
of the time we can interpret $A_1 e^{-ik_1x}$ as a plane harmonic wave
progressing from left to right while $B_1 e^{ik_1x}$ corresponds to a similar
wave in the opposite direction. This interpretation makes for greater
picturesqueness in the discussion. The solution of (14) depends on the
relation between $E_x$ and $V_0$. Let us first assume that $E_x > V_0$. The
general solution of (14) then becomes

$$\psi_2 = A_2 e^{-ik_2x} + B_2 e^{ik_2x}, \tag{16}$$

with $k_2^2 = 8\pi^2 m (E_x - V_0)/h^2$, where we shall employ the wave
interpretation as in (15). The functions $\psi_1$ and $\psi_2$ must satisfy the
boundary conditions

$$\psi_1(0) = \psi_2(0), \tag{17}$$

$$\left(\frac{d\psi_1}{dx}\right)_{x=0} = \left(\frac{d\psi_2}{dx}\right)_{x=0}. \tag{18}$$

In other words there must be continuity in both the $\psi$ function and its
gradient in crossing the boundary. The use of (15) and (16) in (17)
and (18) yields

$$A_1 + B_1 = A_2 + B_2, \tag{19}$$

$$A_1 - B_1 = \frac{k_2}{k_1}\,(A_2 - B_2). \tag{20}$$

If we assume that the region II extends indefinitely to the right there
will be no $\psi$ wave function corresponding to motion in the negative x
direction in II. We must therefore take $B_2 = 0$. With this choice the
solution of (19) and (20) leads to the ratio of reflected to incident charge
density (cf. the physical meaning of the $\psi$ function, Sec. 1, Chapter
VIII)

$$\frac{|B_1|^2}{|A_1|^2} = \frac{(k_1 - k_2)^2}{(k_1 + k_2)^2}, \tag{21}$$

while the corresponding ratio of transmitted to incident density is

$$\frac{|A_2|^2}{|A_1|^2} = \frac{4k_1^2}{(k_1 + k_2)^2}. \tag{22}$$

To get the relative average rates of flow of charge we must multiply
the charge density in each case by the electron velocity. To the left
of the boundary this is

$$v_1 = \frac{k_1 h}{2\pi m}, \tag{23}$$

whereas to the right of the boundary

$$v_2 = \frac{k_2 h}{2\pi m}. \tag{24}$$

If we define the transmission coefficient D as the ratio of the average
rate of flow of charge away from the boundary in II to the average
rate of flow up to the boundary in I, we obtain

$$D = \frac{4k_1k_2}{(k_1 + k_2)^2}. \tag{25}$$

Similarly the reflection coefficient R is

$$R = \frac{(k_1 - k_2)^2}{(k_1 + k_2)^2}. \tag{26}$$


We see that R + D = 1, as should be expected. It will be observed
that unless $k_1 = k_2$, i.e., $V_0 = 0$, there is always some reflection, which
increases in amount as $V_0$ increases. In fact as $V_0 \to E_x$, $k_2 \to 0$ and
$R \to 1$. The distinction between classical theory and quantum mechanics
is well illustrated by the fact that whereas on classical mechanics
all electrons with energy larger than $V_0$ should be able to climb to
the $V_0$ level or plateau, quantum mechanics predicts some reflection
even in this case, a reflection that, to be sure, decreases rapidly as
$E_x - V_0$ increases.
It is left as a problem to the reader to solve the transmission problem
for $E_x < V_0$ and to show that



However, the velocity in II now becomes imaginary and hence there 
can be no average flow of charge through II. Indeed the wave function 
in II is an exponential function of x with a negative exponent. 
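
The behavior of R and D at the potential step can be tabulated directly from (25) and (26). The sketch below is only illustrative: since $k_1$ and $k_2$ share the factor $\sqrt{8\pi^2m}/h$, which cancels in both ratios, the energies may be given in any common unit. For $E_x \le V_0$ it returns D = 0 (no average flow in II, as stated above) and hence R = 1.

```python
import math


def step_coefficients(E_x, V0):
    """Return (R, D) for a potential step of height V0, eqs. (25)-(26)."""
    if E_x <= V0:
        # No average flow of charge in region II: total reflection.
        return 1.0, 0.0
    k1 = math.sqrt(E_x)        # proportional to k_1
    k2 = math.sqrt(E_x - V0)   # proportional to k_2
    R = ((k1 - k2) / (k1 + k2)) ** 2
    D = 4.0 * k1 * k2 / (k1 + k2) ** 2
    return R, D


if __name__ == "__main__":
    for E_x in (10.1, 11.0, 15.0, 40.0):   # V0 = 10 in the same units
        R, D = step_coefficients(E_x, 10.0)
        print(E_x, R, D, R + D)
```

The printed sums confirm R + D = 1, and the reflection is seen to fall off as the excess energy $E_x - V_0$ grows.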

We are now ready to apply the theory of this section to thermionic
emission. The fundamental eq. (1) must be rewritten to take account
of the transmission coefficient which is a function of $v_x$. The
thermionic current density now becomes

$$J = \frac{2m^3 e}{h^3}\int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty}\int_{v_{x0}}^{+\infty}\frac{v_x\,D(v_x)}{1 + e^{-\gamma_1+E/kT}}\,dv_x\,dv_y\,dv_z, \tag{28}$$

where $D(v_x)$ is given by eq. (25). If we employ the same type of
approximation used in the first part of the section, J becomes

$$J = \frac{4\pi m e kT}{h^3}\,e^{\gamma_1}\int_{E_c}^{\infty} D(E_x)\,e^{-E_x/kT}\,dE_x. \tag{29}$$

The transformation $u = (E_x - E_c)/kT$ changes this to

$$J = \frac{4\pi m e k^2T^2}{h^3}\,e^{(\phi - E_c)/kT}\int_0^{\infty} D(u)\,e^{-u}\,du. \tag{30}$$

Now if $V_0 = E_c$, eq. (25) yields

$$D(u) = \frac{4\sqrt{ukT}\,\sqrt{E_c + ukT}}{\left(\sqrt{ukT} + \sqrt{E_c + ukT}\right)^2}. \tag{31}$$

From what has been said previously $kT/E_c \ll 1$. For values of
$u > E_c/kT$ we get very little contribution to the integral in (30)
because of the exponential factor $e^{-u}$. Therefore the evaluation can
best proceed by expanding in powers of $\sqrt{ukT/E_c}$. We have

$$\int_0^{\infty} D(u)\,e^{-u}\,du = 4\int_0^{\infty}\frac{e^{-u}\sqrt{ukT}\,\bigl[1 + ukT/2E_c + \cdots\bigr]\,du}{\sqrt{E_c}\,\bigl[\sqrt{ukT/E_c} + \sqrt{1 + ukT/E_c}\,\bigr]^2}. \tag{32}$$

After a little reduction this becomes

$$\int_0^{\infty} D(u)\,e^{-u}\,du = 4\sqrt{\frac{kT}{E_c}}\int_0^{\infty}\sqrt{u}\,\Bigl(1 - 2\sqrt{\frac{ukT}{E_c}} + \cdots\Bigr)e^{-u}\,du.$$

Since $\int_0^{\infty}\sqrt{u}\,e^{-u}\,du = \sqrt{\pi}/2$, $\int_0^{\infty} u\,e^{-u}\,du = 1$ and
$\int_0^{\infty} u^{3/2}e^{-u}\,du = 3\sqrt{\pi}/4$, we can finally write to the
indicated approximation from (30)

$$J = \frac{4\pi m e k^2T^2}{h^3}\,e^{(\phi - E_c)/kT}\cdot 4\sqrt{\frac{kT}{E_c}}\left[\frac{\sqrt{\pi}}{2} - 2\sqrt{\frac{kT}{E_c}} + \cdots\right]. \tag{33}$$

The factor $4\sqrt{kT/E_c}\,[\sqrt{\pi}/2 - 2\sqrt{kT/E_c} + \cdots]$ can be
considered as an effective average transmission coefficient and denoted
by $\bar{D}(u)$. If the temperature is 1000 K, kT is approximately 0.1
electron-volt and if $E_c$ = 10 electron-volts, $kT/E_c$ = 0.01. Substitution
indicates that here $\bar{D}(u)$ = 0.27 approximately. While this is not equal
to the 0.50 apparently demanded in order to make the theoretical J agree
with experiment for a number of metals, the order of magnitude agreement
is sufficient to indicate that the method of explanation is at any rate on
the right track. It is very unlikely that the abrupt potential barrier
visualized in Fig. 11-1 is actually realized in fact. It is much more
likely that the situation is that depicted in Fig. 11-2 where the
transition from V = 0 to V = $V_0$ takes place more or less gradually. This
will decrease the reflection for given excess energy, i.e., $E_x - E_c$, and
hence increase $\bar{D}(u)$. However, the analysis will in general be rather
complicated. Moreover the exact shape of the transition curve in
Fig. 11-2 is unknown anyway. However, the value of the amplitude term in
the thermionic current varies very widely and values higher than
60 amperes/cm² (degree C)² are not uncommon.

FIG. 11-2.
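
The effective average transmission coefficient can also be obtained by direct numerical integration of (31) in (30), which gives a check on the two-term expansion. The following is a minimal sketch, assuming a simple Simpson rule after the substitution $u = t^2$ (a choice made here to remove the square-root behavior at u = 0); the function names are hypothetical.

```python
import math


def D_step(u, eps):
    """Eq. (31) written with eps = kT/E_c: D = 4ab/(a+b)^2."""
    a = math.sqrt(eps * u)
    b = math.sqrt(1.0 + eps * u)
    return 4.0 * a * b / (a + b) ** 2


def average_D(eps, t_max=8.0, n=4000):
    """Integral of D(u) exp(-u) du from 0 to infinity, via u = t^2."""
    h = t_max / n   # n must be even for Simpson's rule

    def g(t):
        u = t * t
        return D_step(u, eps) * math.exp(-u) * 2.0 * t

    total = g(0.0) + g(t_max)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * g(i * h)
    return total * h / 3.0


def two_term_D(eps):
    """The two-term factor appearing in eq. (33)."""
    return 4.0 * math.sqrt(eps) * (math.sqrt(math.pi) / 2.0 - 2.0 * math.sqrt(eps))


if __name__ == "__main__":
    print(two_term_D(0.01), average_D(0.01))
```

For $kT/E_c$ = 0.01 the two-term factor reproduces the 0.27 quoted above, while the full integral comes out slightly larger because of the neglected higher terms. Neither reaches 0.50.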

In the preceding discussion we have neglected a force whose influence
is to round off the course of the potential energy function as
indicated in Fig. 11-2. This is the so-called image force from
electrostatics. It corresponds to an attraction of magnitude $e^2/4x^2$ when
the electron is at distance x from the interface of the two media.
Associated with this is a potential energy function $V = -e^2/4x$. We shall
consider its use in Sec. 4.

3. TRANSMISSION OF ELECTRONS THROUGH A POTENTIAL HILL
AND APPLICATION TO THERMIONIC EMISSION

It is conceivable that if the surface of a metal is coated with a film
of some foreign substance the nature of the potential barrier will be
materially altered from that shown in Figs. 11-1 or 11-2. Indeed it
may look rather like Fig. 11-3, where the potential rises at the surface
of the metal to V = $V_0$ and after the thickness l is traversed, falls to
the value V = $V_1$. Let us compute the electron transmission through such
a hump. We shall assume that $E_x > V_1$ though it need not be greater than
$V_0$, as will be seen.

FIG. 11-3.

The solutions of the one-dimensional Schrödinger equation for the
regions I, II, and III are respectively, with $E_x > V_0 > V_1$,

$$\psi_1 = A_1 e^{-ik_1x} + B_1 e^{ik_1x},$$
$$\psi_2 = A_2 e^{-ik_2x} + B_2 e^{ik_2x}, \tag{34}$$
$$\psi_3 = A_3 e^{-ik_3(x-l)},$$

where $k_1^2 = \dfrac{8\pi^2 m E_x}{h^2}$, $k_2^2 = \dfrac{8\pi^2 m}{h^2}(E_x - V_0)$, and
$k_3^2 = \dfrac{8\pi^2 m}{h^2}(E_x - V_1)$, and $k_1$, $k_2$, and $k_3$ are all real.
The term $B_3 e^{ik_3x}$ has been omitted for the usual reason. The factor
(x - l) is used in place of x in $\psi_3$ merely for convenience. The
boundary conditions at x = 0 are

$$A_1 + B_1 = A_2 + B_2, \qquad A_1 - B_1 = \frac{k_2}{k_1}\,(A_2 - B_2), \tag{35}$$

and at x = l,

$$A_2 e^{-ik_2l} + B_2 e^{ik_2l} = A_3, \qquad A_2 e^{-ik_2l} - B_2 e^{ik_2l} = \frac{k_3}{k_2}\,A_3. \tag{36}$$

The problem is to eliminate $B_1$, $A_2$, and $B_2$ between (35) and (36)
and obtain $A_3$ in terms of $A_1$. We shall find it convenient to set
$k_2/k_1 = k_{21}$ and $k_3/k_2 = k_{32}$. Eliminating $B_1$ between the two
equations (35), we get $A_1$ in terms of $A_2$ and $B_2$. We can solve (36) for
$A_2$ and $B_2$ in terms of $A_3$. The result is

$$A_3 = \frac{4A_1}{(1 + k_{21})(1 + k_{32})\,e^{ik_2l} + (1 - k_{21})(1 - k_{32})\,e^{-ik_2l}}. \tag{37}$$

On expansion this can be written

$$A_3 = \frac{2A_1}{(1 + k_{21}k_{32})\cos k_2l + i\,(k_{21} + k_{32})\sin k_2l}. \tag{38}$$

Therefore

$$\frac{|A_3|^2}{|A_1|^2} = \frac{4}{(k_{21} + k_{32})^2 + (k_{21}^2 - 1)(k_{32}^2 - 1)\cos^2 k_2l}. \tag{39}$$

To get the transmission coefficient we must introduce the electron
velocities in regions I and III, namely

$$v_1 = \frac{k_1 h}{2\pi m}, \qquad v_3 = \frac{k_3 h}{2\pi m}.$$

Proceeding as in Sec. 2, and setting $k_3/k_1 = k_{31}$ gives for the
transmission coefficient

$$D_1 = \frac{4k_{31}}{(k_{21} + k_{32})^2 + (k_{21}^2 - 1)(k_{32}^2 - 1)\cos^2 k_2l}. \tag{40}$$

Before considering $D_1$ further, let us examine the case in which
$V_0 > E_x > V_1$. Then since $k_2$ is now pure imaginary, we find it
convenient to set $k_2 = ik_2^*$, $k_{21} = ik_{21}^*$, and $k_{32} = -ik_{32}^*$,
where $k_2^*$, $k_{21}^* = k_2^*/k_1$ and $k_{32}^* = k_3/k_2^*$ are all real and
positive. Substitution into eq. (37) gives after rearrangement of terms

$$A_3 = \frac{2A_1}{(1 + k_{21}^*k_{32}^*)\cosh k_2^*l - i\,(k_{21}^* - k_{32}^*)\sinh k_2^*l}. \tag{41}$$

This leads to

$$\frac{|A_3|^2}{|A_1|^2} = \frac{4}{(1 + k_{21}^{*2})(1 + k_{32}^{*2})\cosh^2 k_2^*l - (k_{21}^* - k_{32}^*)^2}. \tag{42}$$

The interesting thing about this result is that in spite of the fact that
$E_x < V_0$, there is still transmission through the potential barrier.
Indeed the transmission coefficient is

$$D_2 = \frac{4k_{31}}{(1 + k_{21}^{*2})(1 + k_{32}^{*2})\cosh^2 k_2^*l - (k_{21}^* - k_{32}^*)^2}. \tag{43}$$

If we resubstitute in terms of the actual energies, the result is

$$D_2 = \frac{4\sqrt{E_x(E_x - V_1)}\,(V_0 - E_x)}{V_0(V_0 - V_1)\cosh^2 k_2^*l - \bigl[(V_0 - E_x) - \sqrt{E_x(E_x - V_1)}\,\bigr]^2}. \tag{44}$$

The transmission vanishes for $E_x < V_1 < V_0$, as is evident from the
remarks made immediately following eq. (27). However (44) indicates
that the transmission coefficient rises from zero to a definite value as
$E_x$ goes from $V_1$ to $V_0$ and thereafter increases as $E_x$ increases. A
rough plot of the behavior of D as a function of $E_x$ is shown in Fig.
11-4. This is, of course, a composite of $D_2$ (from $V_1$ to $V_0$) and $D_1$
from $V_0$ on.

FIG. 11-4.
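
The composite behavior of D can be checked numerically. The sketch below evaluates the transmission directly from the amplitude ratio of eq. (37), letting $k_2$ become imaginary automatically below $V_0$, and compares it with the closed forms (40) and (43). It is illustrative only: the parameter c stands for $8\pi^2m/h^2$ in whatever units are chosen (an assumption for convenience), the function names are hypothetical, and the point $E_x = V_0$ itself is excluded because $k_{32}$ is undefined there.

```python
import cmath
import math


def hill_D_from_amplitudes(E_x, V0, V1, l, c=1.0):
    """D = (k3/k1)|A3/A1|^2 using eq. (37); requires E_x > V1, E_x != V0."""
    k1 = math.sqrt(c * E_x)
    k2 = cmath.sqrt(c * (E_x - V0))   # pure imaginary when E_x < V0
    k3 = math.sqrt(c * (E_x - V1))
    k21, k32 = k2 / k1, k3 / k2
    denom = ((1 + k21) * (1 + k32) * cmath.exp(1j * k2 * l)
             + (1 - k21) * (1 - k32) * cmath.exp(-1j * k2 * l))
    return (k3 / k1) * abs(4 / denom) ** 2


def hill_D_closed_form(E_x, V0, V1, l, c=1.0):
    """Eq. (40) for E_x > V0 and eq. (43) for V1 < E_x < V0."""
    k1 = math.sqrt(c * E_x)
    k3 = math.sqrt(c * (E_x - V1))
    k31 = k3 / k1
    if E_x > V0:
        k2 = math.sqrt(c * (E_x - V0))
        k21, k32 = k2 / k1, k3 / k2
        den = ((k21 + k32) ** 2
               + (k21**2 - 1) * (k32**2 - 1) * math.cos(k2 * l) ** 2)
    else:
        k2s = math.sqrt(c * (V0 - E_x))
        k21s, k32s = k2s / k1, k3 / k2s
        den = ((1 + k21s**2) * (1 + k32s**2) * math.cosh(k2s * l) ** 2
               - (k21s - k32s) ** 2)
    return 4 * k31 / den


if __name__ == "__main__":
    for E_x in (0.3, 0.6, 0.9, 1.5, 3.0):   # V0 = 1.0, V1 = 0.2, l = 2.0
        print(E_x, hill_D_from_amplitudes(E_x, 1.0, 0.2, 2.0),
              hill_D_closed_form(E_x, 1.0, 0.2, 2.0))
```

Agreement of the two columns on both sides of $V_0$ confirms that (43) is simply (40) continued to imaginary $k_2$.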

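The thermionic current in this example will be evaluated from
eq. (30) on substitution of the appropriate value of D. Thus we seek
(placing $E_c = V_1$)

$$\bar{D}(u) = \int_0^{\infty} D(ukT + V_1)\,e^{-u}\,du. \tag{45}$$

This can at once be split up as follows

$$\bar{D}(u) = \int_0^{(V_0 - V_1)/kT} D_2\,e^{-u}\,du + \int_{(V_0 - V_1)/kT}^{\infty} D_1\,e^{-u}\,du, \tag{46}$$

with

$$D_1 = \frac{4\sqrt{\dfrac{ukT}{ukT + V_1}}}{\left[\sqrt{\dfrac{ukT + V_1 - V_0}{ukT + V_1}} + \sqrt{\dfrac{ukT}{ukT + V_1 - V_0}}\,\right]^2 + \dfrac{V_0(V_1 - V_0)\cos^2 k_2l}{(ukT + V_1)(ukT + V_1 - V_0)}}, \tag{47}$$

while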



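$$D_2 = \frac{4\sqrt{\dfrac{ukT}{ukT + V_1}}}{\dfrac{V_0(V_0 - V_1)}{(ukT + V_1)(V_0 - V_1 - ukT)}\cdot\dfrac{1 + \cosh 2k_2^*l}{2} - \left[\sqrt{\dfrac{V_0 - V_1 - ukT}{ukT + V_1}} - \sqrt{\dfrac{ukT}{V_0 - V_1 - ukT}}\,\right]^2}. \tag{48}$$

Now in the evaluation of the second integral in (46) $u > (V_0 - V_1)/kT$
or $V_1 + ukT > V_0$. Hence for most of the way from $u = (V_0 - V_1)/kT$
to $\infty$, $D_1$ will approximate unity. Therefore the second integral
becomes of the order of $e^{-(V_0 - V_1)/kT}$. Keeping this in mind we
examine the first integral. By a slight rearrangement and expansion
of the square bracket in the denominator of (48) we have

$$D_2 = \frac{16\sqrt{ukT}\,\sqrt{ukT + V_1}\,(V_0 - V_1 - ukT)}{V_0(V_0 - V_1)\,\mathfrak{D}}, \tag{49}$$

where

$$\mathfrak{D} = 2 + e^{2k_2^*l} + e^{-2k_2^*l} - \frac{4ukT(ukT + V_1) + 4(V_0 - V_1 - ukT)^2 - 8\sqrt{ukT}\,\sqrt{ukT + V_1}\,(V_0 - V_1 - ukT)}{V_0(V_0 - V_1)}.$$

Evidently the dominant term in $\mathfrak{D}$ in the range of u involved in
the first integral is the second. If we neglect the rest we obtain

$$\int_0^{(V_0 - V_1)/kT} D_2\,e^{-u}\,du = \frac{16}{V_0(V_0 - V_1)}\int_0^{(V_0 - V_1)/kT} e^{-u}\sqrt{(ukT)(ukT + V_1)}\,(V_0 - V_1 - ukT)\,e^{-2k_2^*l}\,du. \tag{50}$$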



After some expansions based on the smallness of $ukT/V_1$, etc., we
finally arrive at the approximate result that the above integral is

(51)
This leads to

KA/TrA/F, fcT _../= =-./^ rrz 

a term of order 

(52) 

Now for much of the thermionic range we may neglect the second term 
compared with the first. The thermionic current density can then be 
put into the form 

$$J = AT^2e^{-b_0/T}, \tag{53}$$

with 

and 

R- fh 

(55) 

In a general way (53) agrees with experiment. For details the reader 
should consult Dushman's article on Thermionic Emission previously 
referred to, particularly pp. 468 ff. 

4. EFFECT OF STRONG FIELDS ON EMISSION 

In our discussion of thermionic emission we have neglected the
effect of the electric field actually used to get the electrons away from
the surface, assuming with apparent success that its influence is very
small. However, as the external field strength is increased it becomes a
significant factor. Suppose the intensity of the applied field is F
(assumed uniform), corresponding to a negative potential energy
$-eFx$. Taking the boundary at x = 0, as usual, and including the
image force potential energy (already mentioned at the end of Sec. 2),
the total potential energy in the neighborhood of the boundary may
be written in the form

$$V = V_0 - \frac{e^2}{4x} - eFx. \tag{56}$$

Here $V_0$ is the asymptotic height of the potential wall outside the
metal in the absence of an external field.² The effect of the external
field is to lower the potential below this asymptotic level, as indicated
in Fig. 11-5. The first effect is to reduce the effective jump in potential
across the boundary. The maximum value of V in (56) occurs at
$x_1 = \tfrac{1}{2}\sqrt{e/F}$ and is equal to $V_{\max} = V_0 - e^{3/2}\sqrt{F}$. If we
could assume that the principal effect of the field really is to lower the
effective $V_0$ by the amount $e^{3/2}\sqrt{F}$, i.e., neglect the portion of the
V curve beyond $x_1$, the sole correction to the emission formula (33)
would be multiplication by the factor $e^{e^{3/2}\sqrt{F}/kT}$. Thus (33) would
become

$$J = \frac{4\pi m e k^2T^2}{h^3}\cdot 4\sqrt{\frac{kT}{E_c}}\left[\frac{\sqrt{\pi}}{2} - 2\sqrt{\frac{kT}{E_c}} + \cdots\right]e^{-(E_c - \phi - e^{3/2}\sqrt{F})/kT}, \tag{57}$$

corresponding to increased transmission with increasing field intensity.
The experiments of de Bruyne³ agree rather well with (57).

2 The formula (56) obviously cannot apply right at the boundary, x = 0. We
shall assume that it holds to within 10 Å or so of the boundary.

FIG. 11-5.
As the value of F is increased to about $10^6$ volts/cm, the dip of the
potential curve near the maximum becomes so great that the above
approximation no longer holds. The situation is approximately
depicted in Fig. 11-6 in which the effect of the image force, leading to
the rounded off dotted curve, may be neglected and for the sake of
simplicity the potential course may be represented by the heavy curve.
This is intended to approximate the situation at absolute zero. To
the left of the barrier the Schrödinger equation for the x dependent part
of the function is

$$\frac{d^2\psi}{dx^2} + \frac{8\pi^2 m}{h^2}\,E_x\,\psi = 0, \tag{58}$$

and to the right of the barrier

$$\frac{d^2\psi}{dx^2} + \frac{8\pi^2 m}{h^2}\,(E_x - V_0 + eFx)\,\psi = 0. \tag{59}$$

3 Proc. Roy. Soc. 120, 423 (1928).

FIG. 11-6.

The complete solution of the transmission problem has been given by
Fowler and Nordheim⁴ and we shall merely quote the result for the
transmission coefficient (note: $E_x < V_0$), which is



By referring to eqs. (28, 29, and 30) we can now write for the electron 
current density 



where our previous formulas are modified to a certain extent since the
energy of the electrons cannot exceed the maximum energy $\phi = \gamma_1kT$,
corresponding to low temperatures. In spite of this limitation (61)
indicates that a non-vanishing current exists. It should be pointed
out that in (61) $\eta = m(v_y^2 + v_z^2)/2kT$. The evaluation of J depends
first on the integral



r___ 

fc/O ^ 



Since $E_x < \phi$, $(E_x - \phi)/kT$ is large negatively. Consequently we may
evaluate the integral from (114) of Chapter VIII using p = 0. It will
be sufficient to take only the first term in the result (eq. 122 of Chapter
VIII) and write
- E X 



r * 

7 e'+i*-/* r + 1 \ 



kT 
4 Proc. Roy. Soc. 119, 173 (1928).

Substitution into (61) with the use of (60) yields

(63)

To evaluate the integral, let $\phi - E_x = \epsilon$ and expand
$(V_0 - \phi + \epsilon)^{3/2}$ in powers of $\epsilon$. For convenience represent
$\sqrt{8\pi^2m/h^2}$ by $\mu$. The integral then becomes




. 
/o 



We expand the radical in powers of . The resulting integration is 
simple and the final result becomes 



The experimentally observed emission agrees in general with (64),
being indeed of the form

$$J = KF^2e^{-a/F}, \tag{65}$$

where K and a are constants. The exponential dependence on F is
particularly well substantiated by experiment.⁵
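
A form such as (65) is usually tested by plotting $\log(J/F^2)$ against 1/F, which should give a straight line of slope $-a$. The sketch below illustrates the idea by recovering K and a from two points; the numerical values of K and a used in the example are invented purely for the demonstration and are not experimental data.

```python
import math


def field_emission_current(F, K, a):
    """Current density of the form (65): J = K F^2 exp(-a/F)."""
    return K * F * F * math.exp(-a / F)


def fit_two_points(F1, J1, F2, J2):
    """Recover (K, a) from two (F, J) pairs using log(J/F^2) = log K - a/F."""
    y1 = math.log(J1 / F1**2)
    y2 = math.log(J2 / F2**2)
    a = -(y2 - y1) / (1.0 / F2 - 1.0 / F1)
    K = math.exp(y1 + a / F1)
    return K, a


if __name__ == "__main__":
    K_true, a_true = 2.0e-6, 3.0e7   # hypothetical constants
    F1, F2 = 2.0e7, 4.0e7
    J1 = field_emission_current(F1, K_true, a_true)
    J2 = field_emission_current(F2, K_true, a_true)
    print(fit_two_points(F1, J1, F2, J2))
```

With real measurements one would fit a line through many points rather than two, but the linearization is the same.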

5. THE PHOTOELECTRIC EFFECT 

An electron emission problem closely allied to thermionic emission
is the photoelectric effect. When light falls on a cold metal surface,
electrons are emitted if the frequency of the light exceeds a certain
threshold frequency $\nu_0$. We can understand this situation in terms of
the preceding sections of this chapter, since when the metal is at a
temperature near absolute zero the maximum kinetic energy of the
electrons is about equal to $\phi$ (Sec. 1). Consequently in order that an
electron with energy E less than $\phi$ shall get over the barrier $E_c = V_0$
it must get from the incident light quantum the energy $h\nu$ such that

$$h\nu + E = E_c. \tag{66}$$

The least frequency which will satisfy this condition is therefore $\nu_0$,
where $h\nu_0 + \phi = E_c$ or

$$\nu_0 = (E_c - \phi)/h. \tag{67}$$



5 See Fowler, "Statistical Mechanics," second edition, p. 357 and accompanying
references.

The photoelectric threshold thus proves to be vitally related to the
thermionic work function. This relation is well substantiated by the
experimental values for many metals as the accompanying table shows.
In considering the comparison we must, of course, recall that the
threshold is not usually measured at T = 0 and there is bound to be
some temperature effect.

TABLE OF THERMIONIC WORK FUNCTIONS AND PHOTOELECTRIC THRESHOLDS

(From Fowler, "Statistical Mechanics")

The values are in electron volts

Metal              Cs      Ta      Mo      W       Ni      Pd
$h\nu_0$           1.9     4.11    4.15    4.54    5.01    4.97
$(E_c - \phi)$     1.81    4.12    4.15    4.54    5.03    4.99
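
The tabulated thresholds can be converted to threshold wavelengths, and the agreement of the two rows can be checked, with a short calculation. The sketch below takes the numbers from the table above; the conversion constant hc = 1239.84 eV·nm is a modern value assumed here.

```python
HC_EV_NM = 1239.84198   # h*c in eV nm (modern value, assumed)

# (photoelectric threshold h*nu_0, thermionic work function E_c - phi), in eV
TABLE = {
    "Cs": (1.9, 1.81),
    "Ta": (4.11, 4.12),
    "Mo": (4.15, 4.15),
    "W": (4.54, 4.54),
    "Ni": (5.01, 5.03),
    "Pd": (4.97, 4.99),
}


def threshold_wavelength_nm(h_nu0_eV):
    """Longest wavelength able to eject electrons, lambda_0 = hc/(h nu_0)."""
    return HC_EV_NM / h_nu0_eV


def largest_discrepancy_eV(table):
    """Largest |h nu_0 - (E_c - phi)| over all metals in the table."""
    return max(abs(photo - thermo) for photo, thermo in table.values())


if __name__ == "__main__":
    for metal, (photo, thermo) in TABLE.items():
        print(metal, photo, thermo, threshold_wavelength_nm(photo))
    print(largest_discrepancy_eV(TABLE))
```

Only cesium has its threshold in the visible; the others lie in the ultraviolet, and the two rows never differ by as much as a tenth of an electron volt.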



We shall give an elementary analysis of the emission based on the
assumption that the photoelectric current density per unit light
intensity is simply proportional to the number of electrons incident per
second normally on the surface of the metal for which

$$E_x + h\nu > E_c,$$

where $E_x$ is the kinetic energy associated with the direction normal to
the metal surface taken as the x axis. Now the number of electrons
with x component velocity lying between $v_x$ and $v_x + dv_x$ striking unit
area of surface per second is, when expressed as a flow of charge
(from Sec. 1, eq. 1),

$$dJ = \frac{2m^3 e}{h^3}\,v_x\,dv_x\int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty}\frac{dv_y\,dv_z}{e^{-\gamma_1+E/kT}+1}. \tag{68}$$

There is a certain advantage in transforming from $v_x$, $v_y$, $v_z$ to
cylindrical coordinates $v_x$, $\rho$ and $\vartheta$, where

$$\rho^2 = v_y^2 + v_z^2. \tag{69}$$
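Then (68) becomes

$$dJ = \frac{4\pi m^3 e}{h^3}\,v_x\,dv_x\int_0^{\infty}\frac{\rho\,d\rho}{e^{-\gamma_1 + m(v_x^2+\rho^2)/2kT}+1}. \tag{70}$$

The integration yields

$$dJ = \frac{4\pi m^2 e kT}{h^3}\,v_x\log\left(1 + e^{\gamma_1 - mv_x^2/2kT}\right)dv_x. \tag{71}$$

The photoelectric current density per unit light intensity is then

$$I = K\,\frac{4\pi m e kT}{h^3}\int_{E_c - h\nu}^{\infty}\log\left(1 + e^{(\phi - E_x)/kT}\right)dE_x, \tag{72}$$

where K is a constant proportionality factor. Introduce the
transformation $w = e^{(\phi - E_x)/kT}$ and call $e^{[\phi - (E_c - h\nu)]/kT} = w_0$. The
current density takes the form

$$I = K\,\frac{4\pi m e k^2T^2}{h^3}\int_0^{w_0}\frac{\log(1 + w)}{w}\,dw. \tag{73}$$
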

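The integral is a well-known one. We distinguish between two cases,
i.e., (a) $\nu < \nu_0$ for which $w_0 < 1$, and (b) $\nu > \nu_0$, for which
$w_0 > 1$. For (a) we have, setting $(h\nu_0 - h\nu)/kT = \alpha$,

$$I = K\,\frac{4\pi m e k^2T^2}{h^3}\left[e^{-\alpha} - \frac{e^{-2\alpha}}{2^2} + \frac{e^{-3\alpha}}{3^2} - \cdots\right]. \tag{74}$$

For (b) we write

$$\int_0^{w_0}\frac{\log(1+w)}{w}\,dw = \int_0^{1}\frac{\log(1+w)}{w}\,dw + \int_1^{w_0}\frac{\log(1+w)}{w}\,dw,$$

whence for $\nu > \nu_0$

$$I = K\,\frac{4\pi m e k^2T^2}{h^3}\left[\frac{\pi^2}{6} + \frac{\alpha^2}{2} - \left(e^{\alpha} - \frac{e^{2\alpha}}{2^2} + \frac{e^{3\alpha}}{3^2} - \cdots\right)\right]. \tag{75}$$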
We thus see that the photoelectric current density I is given in the
general form

$$I/T^2 = A\,f(\alpha), \tag{76}$$

where A is the constant $K\cdot 4\pi m e k^2/h^3$ and $f(\alpha)$ is a function to be
evaluated from the previous equations (74) and (75). We note at once
the interesting fact that if $\nu = 0$, so that
$\alpha = h\nu_0/kT = (E_c - \phi)/kT$, the equation (76) reduces to a close
approximation to

$$I = AT^2e^{-(E_c - \phi)/kT}, \tag{77}$$

which is just the thermionic emission equation (6) with the exception
of the multiplicative factor K. The higher terms in the expansion
(74), i.e., $e^{-2\alpha}$, $e^{-3\alpha}$, etc., are all negligible compared with
$e^{-\alpha}$ in this case.
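
The function $f(\alpha)$ defined by (74) and (75) is easily tabulated. The sketch below sums the two series and, as a check, compares them with a direct numerical evaluation of the integral in (73) with $w_0 = e^{-\alpha}$. The truncation at 2000 terms and the Simpson rule are choices made here for illustration; the function names are hypothetical.

```python
import math


def fowler_f(alpha, terms=2000):
    """f(alpha) from the series (74) for alpha >= 0 and (75) for alpha < 0."""
    x = abs(alpha)
    series = sum((-1) ** (n + 1) * math.exp(-n * x) / n**2
                 for n in range(1, terms + 1))
    if alpha >= 0:
        return series
    return math.pi**2 / 6.0 + alpha**2 / 2.0 - series


def fowler_f_numeric(alpha, n=20000):
    """Integral of log(1+w)/w from 0 to w0 = exp(-alpha), by Simpson's rule."""
    w0 = math.exp(-alpha)
    h = w0 / n   # n must be even

    def g(w):
        # the integrand tends to 1 as w -> 0
        return 1.0 if w == 0.0 else math.log1p(w) / w

    total = g(0.0) + g(w0)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * g(i * h)
    return total * h / 3.0


if __name__ == "__main__":
    for alpha in (5.0, 1.0, 0.0, -1.0, -5.0):
        print(alpha, fowler_f(alpha), fowler_f_numeric(alpha))
```

The two branches join smoothly at $\alpha = 0$, where both give $\pi^2/12$. For large positive $\alpha$ the function falls off as $e^{-\alpha}$, and for large negative $\alpha$ it grows as $\alpha^2/2$.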

Following DuBridge⁶ we examine the form of eq. (76) when the
surface of the metal is at T = 0. Here the argument of the function
f is $+\infty$ or $-\infty$ according as $\nu < \nu_0$ or $\nu > \nu_0$. Consequently
we get the alternative results

$$I = 0 \quad \text{for } \nu < \nu_0, \qquad I = \frac{A}{2k^2}\,(h\nu - h\nu_0)^2 \quad \text{for } \nu > \nu_0. \tag{78}$$

If then we plot the photoelectric current density as a function of
frequency, we find no emission out to the threshold frequency $\nu_0$;
thereafter the emission is a quadratic function of the frequency. The curve
is indeed a parabola with vertex at $\nu = \nu_0$ as indicated in the
accompanying figure (Fig. 11-7). For T > 0, the plot of (76) using (74)
and (75) yields curves of somewhat similar shape which, however,
do not touch the axis at precisely $\nu_0$. It is clear that the threshold
frequency has a precise meaning only at absolute zero. At higher
temperatures there is no absolutely sharp threshold, as the
current-frequency curves approach the axis more gradually. Nevertheless it
is convenient to continue to look upon $\nu_0 = (E_c - \phi)/h$ as the
threshold, as it actually would be if we could lower the temperature of the
surface to 0 K without altering $E_c$ and $\phi$. We are justified in looking
upon $\nu_0$ as an important characteristic of the surface.

FIG. 11-7.

Let us go back to eq. (76) and again setting $(h\nu_0 - h\nu)/kT = \alpha$,
take the logarithm of both sides (base 10 is usually the more
convenient). Thus

$$\log_{10}(I/T^2) = B + \Phi(\alpha), \tag{79}$$

where $B = \log_{10}A$ (a constant independent of $\nu$ and T) and
$\Phi(\alpha) = \log_{10}f(\alpha)$. If $\log_{10}(I/T^2)$ is plotted against $\alpha$,
the resulting curve will be independent of the metal and the temperature
except for an additive constant. In other words the curve should be
superposable on the curve obtained by plotting $\Phi(\alpha)$ against $\alpha$ by a
shift along the ordinate axis of amount B. The form of $\Phi(\alpha)$ is shown
approximately in Fig. 11-8. Unfortunately $\alpha$ is known for given
frequency only when $\nu_0$ is known and this depends on the metal. Fowler
therefore suggested the plotting of the experimental values of
$\log(I/T^2)$ against $h\nu/kT$. This curve should agree with the theoretical
curve (79) by a backward shift along the axis of amount $h\nu_0/kT$ and a
vertical shift of B. By ascertaining the horizontal shift, Fowler was able
to obtain theoretically good values of $\nu_0$. For more complete discussion
of the agreement with experiment DuBridge's paper should be consulted.

6 L. A. DuBridge, "New Theories of the Photoelectric Effect," Actualités
Scientifiques, 268, Paris, 1935.

FIG. 11-8.

We next consider the energy distribution of the emitted
photoelectrons and examine the case of those electrons which emerge
normally to the metallic surface. Experimentally the distribution is
measured by applying a retarding potential between the emitting
surface and the collecting electrode and then measuring the
photoelectric current as a function of this potential. The number of
photoelectrons emerging per second normally from the surface against a
retarding potential V may be obtained by a slight modification of eq.
(72). Thus we divide by the charge e and in the lower limit of the
integral employ $E_c + Ve$ in place of $E_c$. The number in question as a
function of V becomes

$$N(V) = K\,\frac{4\pi m kT}{h^3}\int_{E_c + Ve - h\nu}^{\infty}\log\left(1 + e^{(\phi - E_x)/kT}\right)dE_x. \tag{80}$$

If we call the integrand $n(E_x)$, i.e.,

$$n(E_x) = K\,\frac{4\pi m kT}{h^3}\log\left(1 + e^{(\phi - E_x)/kT}\right),$$

it follows that

$$n(E_c - h\nu + Ve) = K\,\frac{4\pi m kT}{h^3}\log\left(1 + e^{[h(\nu - \nu_0) - Ve]/kT}\right), \tag{81}$$

and this is the number of photoelectrons emitted per second by light of
frequency $\nu$ per unit energy interval with the energy
$E = Ve - h(\nu - \nu_0) + \phi = Ve + E_c - h\nu$. It is called the "normal"
energy distribution function. At T = 0, it reduces to

$$n(E_c - h\nu + Ve) = K\,\frac{4\pi m}{h^3}\,\bigl[h(\nu - \nu_0) - Ve\bigr] \quad \text{for } Ve < h(\nu - \nu_0), \tag{82}$$

and to zero for $Ve > h(\nu - \nu_0)$.



When plotted as a function of $Ve - h(\nu - \nu_0)$ this is a straight line
as in Fig. 11-9, intersecting the axis of abscissas at
$h(\nu - \nu_0) = Ve$. When $T \neq 0$, the curve tails off as shown by the
dotted line in Fig. 11-9. The general agreement with experiment is not
too good, probably because of the neglect to take account of a
transmission coefficient as we did in the treatment of thermionic
emission. For a complete analysis of the experimental data, consult
DuBridge. The result is that in the vicinity of the energy
$h(\nu - \nu_0)$, the experiments check the theory rather well.

FIG. 11-9. (Abscissa: $Ve - h(\nu - \nu_0)$.)

Further investigation of the photoelectric effect would take us too
far afield. The reader who is interested in the total energy
distribution of the photoelectrons, i.e., the distribution taking account
of all the velocity components, will find a thorough discussion in
DuBridge's article.

6. CONTACT POTENTIAL 

In the immediate neighborhood of the surface of a metal in a
vacuum there will be at every temperature an atmosphere of electrons
in equilibrium with the metal. The electrostatic potential of the
field produced by this electron gas will be equal to that at the surface
of the metal; but the latter is a function of the metal. Consequently
we expect that when two different metals are very close together in a
vacuum, a potential difference will be set up between them. This is
the so-called contact potential. We can obtain an interesting relation
between the contact potential and the thermionic work functions for
the metals by the following simple considerations.

FIG. 11-10.

In Fig. 11-10 we represent schematically the two metals 1 and 2
separated by a vacuum. The temperature is kept constant at the value
T. The average number of free electrons per unit volume of metal 1 at
the surface is given by eq. (171) of Chapter VIII



(83) 



where $V_1$ is the electrical potential just inside the surface and $-eV_1$ is
the potential energy of the electrons at this place. (We are here, of
course, treating e as a positive quantity.) Similarly in metal 2 the
average density of free electrons is



But from our study of thermionic emission we know that 

71 + IT = IT (85) 

and 

yi+ ltf = kT (86) 

where 0i and 02 are the maximum kinetic energies of the electrons in 
the two metals respectively. In Sec. 1 of this chapter we used = y\kT 
to denote the maximum kinetic energy of the free electrons of the 
metal. However if the electrons are in a force field and possess poten- 
tial energy also, an extension of the analysis of Sec. 7 of Chapter VIII 
shows that it is jikT + eV\ which corresponds to the maximum 
kinetic energy. Hence (85) and (86) are justified. 0i and 0g are 
sometimes referred to as the inner work functions. From (85) and 
(86), we have 

e(Vi - V 2 ) = (0i - 2 ). (87) 

Now V\ V 2 can be written in the form 

V l - V 2 = Fi - F 3 + F 3 - F 4 + F 4 - F 2 , (88) 

where $V_3$ is the electrical potential in the vacuum at the point 3 just 
outside the metal 1, while $V_4$ is the potential in the vacuum at the 
point 4 just outside the metal 2. The quantity $e(V_1 - V_3)$ is the 
minimum kinetic energy for those electrons in 1 which are able to cross 
the surface. If we use the notation of Sec. 1 we should call this $(E_c)_1$. 
Similarly 

$$e(V_2 - V_4) = (E_c)_2.$$

Consequently 

$$\phi_1 - \phi_2 = (E_c)_1 - (E_c)_2 + e(V_3 - V_4)$$

or slightly rewritten 

$$e(V_3 - V_4) = [(E_c)_2 - \phi_2] - [(E_c)_1 - \phi_1]. \tag{89}$$



Now $(E_c)_1 - \phi_1$ is the thermionic work function for the first 
metal and $(E_c)_2 - \phi_2$ that for the second, while $(V_3 - V_4)$ is the 
difference in electrical potential between two points just outside each 
metallic surface respectively. It is by definition the contact potential. The 
relation (89) connects the contact potential between two metals with 
their thermionic work functions. It has been verified experimentally 
for a number of cases, though exact measurements of contact potentials 
are difficult to carry out. For a discussion of the experimental 
agreement Fowler's "Statistical Mechanics," p. 364, may be consulted. 
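
As a rough numerical illustration of relation (89), the short Python sketch below computes the contact potential from the two thermionic work functions. The function names and the numerical values are hypothetical and chosen only for illustration; they are not measured data. Energies are assumed to be given in electron volts, so that dividing by $e$ yields a potential difference with the same numerical value in volts.

```python
def thermionic_work_function(e_c_ev, phi_ev):
    """Thermionic work function E_c - phi, both energies in electron volts."""
    return e_c_ev - phi_ev


def contact_potential(work_function_1_ev, work_function_2_ev):
    """Return V_3 - V_4 in volts according to relation (89).

    e(V_3 - V_4) = [(E_c)_2 - phi_2] - [(E_c)_1 - phi_1].
    With energies in electron volts, division by e leaves the number
    unchanged and the result is in volts.
    """
    return work_function_2_ev - work_function_1_ev


# Hypothetical illustrative values, not experimental data.
w1 = thermionic_work_function(e_c_ev=10.0, phi_ev=5.5)  # metal 1: 4.5 eV
w2 = thermionic_work_function(e_c_ev=12.0, phi_ev=7.0)  # metal 2: 5.0 eV

print(contact_potential(w1, w2))  # 0.5
```

With these assumed numbers the point just outside metal 1 lies 0.5 volt above the point just outside metal 2; interchanging the metals reverses the sign, as (89) requires.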

PROBLEMS 

1. Evaluate the thermionic current density for molybdenum at T = 1,000 K, 
1,500 K, and 2,000 K using both forms of the Richardson equation, i.e., (5) and (6). 
Take $E_c$ = 4.08 electron volts. Compare the results with those obtained using 
the experimental data listed in S. Dushman, Rev. Modern Phys., 2, 381, 1930, p. 394. 

2. Discuss the transmission of electrons through a boundary barrier in which the 
potential rises from $V = 0$ to $V = V_0$ continuously and very slowly with a 
continuous gradient. Show that in this case there is complete transmission for all 
energies. (Cf. the elastic wave analogy in R. B. Lindsay, J. Acous. Soc. Am., 
12, 378 [1941].) 

3. Solve the transmission problem of Sec. 2 for $E < V_0$ and show that eq. (27) 
results. 

4. Find the value of $E_c$ which makes the effective average transmission coefficient 
0.50 for T = 1,000 K. 

5. An image force potential barrier may be represented by the equations 

$V = -V_0, \quad x < 0$



where $x = 0$ represents the boundary. Find the expression for the average 
transmission coefficient in this case. 

6. For a monatomic layer of thorium on tungsten assume that $V_0$ = 10.3 electron 
volts and $V_1$ = 8.4 electron volts. Take $l = 2.85 \times 10^{-8}$ cm. Compute the 
effective average transmission coefficient. 

7. Plot the photoelectric current density per unit light intensity (divided by the 
constant K; cf. Sec. 5) for the metals tungsten and nickel at T = 500 K and 
1,000 K respectively. 

8. Plot the normal energy distribution function of the photoelectrons from 
potassium at 20 C, i.e., eq. (81). 